An Experience Report on Analyzing Industrial Software Systems Using Code Clone Detection Techniques

1 An Experience Report on Analyzing Industrial Software Systems Using Code Clone Detection Techniques
Norihiro Yoshida (NAIST)
Yoshiki Higo, Shinji Kusumoto, Katsuro Inoue (Osaka University)

2 Outline
1. What is a code clone?
2. Discussions on the harmfulness of code clones
3. Importance of sharing industrial experiences with clones
4. Industrial application of clone analysis
   - Analysis tools
   - Results
5. Summary

3 What is a code clone?
- A code fragment that has identical or similar code fragments elsewhere in the source code.
- Introduced into source programs for various reasons, such as reusing code by copy-and-paste.
[Figure: code fragments marked as a code clone]

4 Discussions on the harmfulness of code clones (Opponent)
There have been numerous discussions:
- Cloning opponents: clones should be avoided because they make software maintenance difficult.
  - Books on programming practice
  - Research papers that are slightly behind the state of the art
[Figure: when a bug is found in one code clone, the other clones should be inspected.]

5 Discussions on the harmfulness of code clones (Moderate & Proponent)
- Moderate: Clones are unavoidable when a language lacks a suitable modularization mechanism to eliminate them.
  - e.g., it is difficult to merge code clones into a function when their identifiers do not match.
- Clone proponent: Clones do not often cause bugs, so leave them be.
  - According to Rahman's study [1] of OSS, there is no significant relationship between the locations of bugs and clones.
[1] Rahman, et al., Clones: What is that Smell?, MSR 2011

6 Importance of sharing experiences with cloning
- What is the truth about code clones?
  - There is currently no conclusion.
  - It probably depends on the context of cloning [2].
- Sharing experiences with cloning is a promising way to make harmful and harmless clones easier to identify.
- The software engineering community has to report its experiences with clone detection and analysis.
[2] Kapser, et al., "Cloning considered harmful" considered harmful: patterns of cloning in software, Empirical Software Engineering, 2008.

7 Research on cloning in industry
- Much research has been done on
  - Automatic clone detection
  - Analysis of code clones in OSS
- On the other hand, reports on cloning in industry have been lacking.
  - The ratio of clones to the whole source code is higher in industry than in OSS.
  - Clones cause problems in industry rather than in OSS.
It is necessary to report industrial experience with clone analysis.

8 Overview of industrial case study
1. Investigated an industrial software system using clone analysis techniques, with respect to the following points:
   A. Is there a significant difference in clones between the end of unit testing and the end of combined testing?
   B. Where are clones concentrated in the source code?
   C. What sorts of characteristic clones are present in the source code?
2. Interviewed developers about the detected clones

9 Target software project
- Japanese governmental project
- Software system for traffic infrastructure
- Source code
  - Approximately 100,000 LOC, which increased by 20,000 LOC after unit testing
  - Main language is C/C++
- Organization
  - 5 vendors, each of which was assigned a subsystem
  - 1 project manager from a company different from the vendors

10 Tools for clone detection & analysis
- Clone detection tool: CCFinder [3]
  - Detects lexically similar code clones by identifying identical token sequences in source code
- Code clone analyzer: Gemini [4]
  - Scatter plot
  - Metrics for extracting clones
[3] T. Kamiya, et al.: A multilinguistic token-based code clone detection system for large scale source code, IEEE TSE.
[4] Y. Ueda, et al.: Gemini: Maintenance Support Environment Based on Code Clone Analysis, METRICS 2002.

11 Token-based clone detection tool: CCFinder
Detection of identical token sequences in source code

Source files:

    static void foo() throws RESyntaxException {
        String a[] = new String [] { "123,400", "abc", "orange 100" };
        org.apache.regexp.RE pat = new org.apache.regexp.RE("[0-9,]+");
        int sum = 0;
        for (int i = 0; i < a.length; ++i)
            if (pat.match(a[i]))
                sum += Sample.parseNumber(pat.getParen(0));
        System.out.println("sum = " + sum);
    }
    static void goo(String [] a) throws RESyntaxException {
        RE exp = new RE("[0-9,]+");
        int sum = 0;
        for (int i = 0; i < a.length; ++i)
            if (exp.match(a[i]))
                sum += parseNumber(exp.getParen(0));
        System.out.println("sum = " + sum);
    }

Processing steps: Source files -> Lexical analysis -> Token sequence -> Transformation -> Transformed token sequence -> Match detection -> Clones on transformed sequence -> Formatting -> Clone pairs

[Figure: the token sequences of the two methods before and after transformation; in the transformed sequence, identifiers are replaced with the token $.]
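The three middle stages of this pipeline (lexical analysis, transformation, match detection) can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not CCFinder's implementation: the regular-expression tokenizer, the tiny keyword set, the `min_len` threshold, and the names `tokenize`, `normalize`, and `find_clone_pairs` are all invented for this example, literals are left untouched, and matching is a plain quadratic scan that would not scale to large source code.

```python
import re

# Hypothetical, deliberately tiny keyword set; a real lexer knows the full language.
KEYWORDS = {"int", "for", "if", "new", "static", "void", "throws"}

# String literals, identifiers/keywords, integers, two-character operators, punctuation.
TOKEN_RE = re.compile(r'"[^"]*"|[A-Za-z_]\w*|\d+|\+\+|\+=|[^\s\w]')


def tokenize(source):
    """Lexical analysis: split source text into a flat list of tokens."""
    return TOKEN_RE.findall(source)


def normalize(tokens):
    """Transformation: replace every identifier with '$' so renamed copies match."""
    out = []
    for tok in tokens:
        is_identifier = re.fullmatch(r"[A-Za-z_]\w*", tok) is not None
        out.append("$" if is_identifier and tok not in KEYWORDS else tok)
    return out


def find_clone_pairs(a, b, min_len=8):
    """Match detection: maximal common token runs of at least min_len tokens.

    Returns (start_in_a, start_in_b, length) triples.
    """
    pairs = []
    for i in range(len(a)):
        for j in range(len(b)):
            if a[i] != b[j]:
                continue
            # Skip positions that only continue a run that started earlier.
            if i > 0 and j > 0 and a[i - 1] == b[j - 1]:
                continue
            k = 0
            while i + k < len(a) and j + k < len(b) and a[i + k] == b[j + k]:
                k += 1
            if k >= min_len:
                pairs.append((i, j, k))
    return pairs


# One statement from each of the slide's two methods.
foo_body = "if (pat.match(a[i])) sum += Sample.parseNumber(pat.getParen(0));"
goo_body = "if (exp.match(a[i])) sum += parseNumber(exp.getParen(0));"

foo_tokens = normalize(tokenize(foo_body))
goo_tokens = normalize(tokenize(goo_body))
clone_pairs = find_clone_pairs(foo_tokens, goo_tokens)
print(clone_pairs)
```

On these two statements the sketch reports two matched runs, of 15 and 10 tokens. The run breaks at the `Sample.` qualifier, which appears only in `foo`; this naive normalization does nothing to bridge such a difference.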

12 Code clone analyzer: Gemini (Scatter Plot)
- Visually shows where code clones are located
- Both the vertical and horizontal axes represent the token sequence of the source code
  - The origin is the upper-left corner
  - A dot means that the two corresponding tokens on the two axes are the same
[Figure: scatter plot of files F1, F2, F3, F4 in directories D1 and D2; both axes carry the token sequence a b c a b c c c a b d e f a b c c d e f. Two marker types distinguish matched positions detected as practical code clones from matched positions detected as non-interesting code clones.]
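The dot-plot idea on this slide can be illustrated with a small text rendering. This is a minimal sketch, assuming a plain token-equality test and a hypothetical `dot_plot` helper; it is not Gemini's rendering code, and it does not separate practical clones from non-interesting ones.

```python
def dot_plot(tokens):
    """Return one text row per token: '*' where row and column tokens are equal."""
    return [
        "".join("*" if row_tok == col_tok else "." for col_tok in tokens)
        for row_tok in tokens
    ]


# Token sequence taken from the axis labels of the slide's figure.
sequence = "a b c a b c c c a b d e f a b c c d e f".split()
plot = dot_plot(sequence)

# The first row and first column correspond to the upper-left origin.
print("\n".join(plot))
```

The main diagonal is always filled, because every token equals itself, and the plot is symmetric about it. Diagonal segments away from the main diagonal are the interesting part: each one marks a token run that occurs at two different places.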

13 Code clone analyzer: Gemini (Clone/File Metrics)
- Example of clone metrics
  - LEN(S): the average length of code fragments (the number of tokens) in clone set S
    - clone set: a set of code fragments in which any pair of the code fragments is a code clone
  - NIF(S): the number of source files including any fragments of S
- Example of file metrics
  - ROC(F): the ratio of duplication of file F
    - if completely duplicated, the value is 1.0
    - if not duplicated at all, the value is 0.0
  - NOC(F): the number of code fragments of any clone set in file F
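The four metrics can be computed from a simple list of fragments. The sketch below rests on assumptions that are not on the slide: the `Fragment` record with a half-open token range, the sample file names, and in particular the reading of ROC(F) as the fraction of a file's tokens covered by at least one clone fragment. Gemini's exact definitions may differ.

```python
from collections import namedtuple

# Hypothetical fragment record: token range [start, end) inside a file.
Fragment = namedtuple("Fragment", ["file", "start", "end"])


def LEN(clone_set):
    """Average fragment length, in tokens, over clone set S."""
    return sum(f.end - f.start for f in clone_set) / len(clone_set)


def NIF(clone_set):
    """Number of distinct source files that contain a fragment of S."""
    return len({f.file for f in clone_set})


def NOC(file, clone_sets):
    """Number of fragments, from any clone set, located in file F."""
    return sum(1 for s in clone_sets for f in s if f.file == file)


def ROC(file, file_length, clone_sets):
    """Fraction of the file's tokens covered by at least one clone fragment."""
    covered = set()
    for s in clone_sets:
        for f in s:
            if f.file == file:
                covered.update(range(f.start, f.end))
    return len(covered) / file_length


# Made-up sample data: two clone sets spread over three files.
s1 = [Fragment("a.c", 0, 40), Fragment("b.c", 10, 50), Fragment("b.c", 60, 100)]
s2 = [Fragment("a.c", 30, 60), Fragment("c.c", 0, 30)]
clone_sets = [s1, s2]

print(LEN(s1), NIF(s1))            # metrics of clone set s1
print(NOC("b.c", clone_sets))      # fragments located in b.c
print(ROC("a.c", 100, clone_sets)) # a.c is assumed to be 100 tokens long
```

Overlapping fragments in the same file are counted once in ROC because coverage is tracked as a set of token positions, which keeps the value within the 0.0 to 1.0 range stated on the slide.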

14 Amount of Code Clones in Subsystems

Company ID   After unit testing             After combined testing
             # clones   Duplicated ratio    # clones   Duplicated ratio
V            --         --%                 --         --%
W            --         --%                 --         --%
X            4,483      55%                 4,768      51%
Y            6,747      43%                 7,628      46%
Z            2,450      56%                 2,505      56%
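The change between the two measurement points can be made explicit with a short calculation over the three rows whose counts are given (X, Y, Z). The helper name `growth` is hypothetical; the counts are copied from the table.

```python
# (clones after unit testing, clones after combined testing), from the table.
clone_counts = {"X": (4483, 4768), "Y": (6747, 7628), "Z": (2450, 2505)}


def growth(before, after):
    """Absolute and relative increase in the number of clones."""
    return after - before, (after - before) / before


for company, (before, after) in clone_counts.items():
    added, ratio = growth(before, after)
    print(f"{company}: +{added} clones ({ratio:.1%})")
# X: +285 clones (6.4%)
# Y: +881 clones (13.1%)
# Z: +55 clones (2.2%)
```

Company Y shows the largest increase in clone count. The duplicated ratio does not always move the same way: for X the count rises while the ratio falls from 55% to 51%.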

15 Amount of Code Clones in Subsystems (highlight)
- Clones had increased during combined testing (see the clone counts for X, Y, and Z on slide 14).

16 Scatter Plot (Company Y)
[Figure: scatter plot with the source code after unit testing and after combined testing on both axes; regions are labeled A, B, C, D, and E.]
- The parts D and E imply the creation of clones after unit testing.
- Interview: The developers stated that they had added a trusted library that has been used in many products.

17 Scatter Plot (Company Y)
[Figure: the same scatter plot, with regions A to E.]
- Part A handles geographical information for several types of vehicles. The code for these types is mostly cloned.
- Part B involves statements for building SQL queries.
- Part C involves initialization and finalization for a certain feature.

18 Example of detected clone: Clone metrics-based analysis
- Longest clones
  - A pair of 154-line clones between two files
  - Signs of copy-and-paste followed by a forgotten modification
[Figure: AAXXBB.cpp and AAYYBB.cpp both contain the comment /*..XX..*/ and the function void XX(). In AAYYBB.cpp, both still read XX, which implies that the modification to YY was forgotten.]
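A forgotten modification of this kind can be flagged with a simple heuristic: find the part of the file name that differs between the original and the copy, then list the lines of the copy that still mention the original's part. The sketch below is only an illustration; the helper names `differing_parts` and `leftover_lines` and the sample file content are assumptions, and the study itself does not describe such a check.

```python
import os


def differing_parts(name_a, name_b):
    """Strip the common prefix and suffix of two names; return what is left of each."""
    prefix = os.path.commonprefix([name_a, name_b])
    a, b = name_a[len(prefix):], name_b[len(prefix):]
    suffix = os.path.commonprefix([a[::-1], b[::-1]])[::-1]
    if suffix:
        a, b = a[: -len(suffix)], b[: -len(suffix)]
    return a, b


def leftover_lines(original_name, copy_name, copy_text):
    """Lines of the copy (1-based) that still contain the original's name part."""
    original_part, _ = differing_parts(original_name, copy_name)
    if not original_part:
        return []
    return [
        (number, line)
        for number, line in enumerate(copy_text.splitlines(), 1)
        if original_part in line
    ]


# Made-up content for the copied file, modeled on the slide's figure.
copied = "/* ... XX ... */\nvoid XX()\n{\n}\n"
suspects = leftover_lines("AAXXBB.cpp", "AAYYBB.cpp", copied)
print(suspects)
```

A plain substring test like this produces false positives when the differing part is short or common, so its output is a list of candidates for a developer to review, not a verdict.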

19 Example of detected clones: File metrics-based analysis
- Source file containing the maximum number of clones: 358 clones
- Most duplicated pair of source files: 96% of tokens are duplicated
- Interview: Developers had expected this duplication since the design phase.

20 Summary & Future work
- Summary
  - Discussed the importance of sharing industrial experiences with clone analysis
  - Presented an industrial application of clone analysis
    - Many characteristic clones were extracted
    - According to interviews about some of the extracted clones, the developers had expected the clones to exist.
- Future work
  - Conduct further analysis to determine whether the clones are harmful


Algorithm to Detect Non-Contiguous Clones with High Precision Algorithm to Detect Non-Contiguous Clones with High Precision Sonam Gupta Research Scholar, Suresh Gyan Vihar University, Jaipur, Rajasthan, India Dr. P.C. Gupta Department of Computer Science and Engineering

More information

CCFinder: A Multi-Linguistic Token-based Code Clone Detection System for Large Scale Source Code

CCFinder: A Multi-Linguistic Token-based Code Clone Detection System for Large Scale Source Code CCFinder: A Multi-Linguistic Token-based Code Clone Detection System for Large Scale Source Code Toshihiro Kamiya, Shinji Kusumoto, Member, IEEE, and Katsuro Inoue, Member, IEEEE Abstract A code clone

More information

Detecting and Analyzing Code Clones in HDL

Detecting and Analyzing Code Clones in HDL Detecting and Analyzing Code Clones in HDL Kyohei Uemura, Akira Mori, Kenji Fujiwara, Eunjong Choi, and Hajimu Iida Nara Institute of Science and Technology, Japan {uemura.kyohei.ub9@is, choi@is, iida@itc}.naist.jp

More information

DCC / ICEx / UFMG. Software Code Clone. Eduardo Figueiredo.

DCC / ICEx / UFMG. Software Code Clone. Eduardo Figueiredo. DCC / ICEx / UFMG Software Code Clone Eduardo Figueiredo http://www.dcc.ufmg.br/~figueiredo Code Clone Code Clone, also called Duplicated Code, is a well known code smell in software systems Code clones

More information

T-SQL Training: T-SQL for SQL Server for Developers

T-SQL Training: T-SQL for SQL Server for Developers Duration: 3 days T-SQL Training Overview T-SQL for SQL Server for Developers training teaches developers all the Transact-SQL skills they need to develop queries and views, and manipulate data in a SQL

More information

Classification Model for Code Clones Based on Machine Learning

Classification Model for Code Clones Based on Machine Learning Empirical Software Engineering manuscript No. (will be inserted by the editor) Classification Model for Code Clones Based on Machine Learning Jiachen Yang Keisuke Hotta Yoshiki Higo Hiroshi Igaki Shinji

More information

Keywords Clone detection, metrics computation, hybrid approach, complexity, byte code

Keywords Clone detection, metrics computation, hybrid approach, complexity, byte code Volume 3, Issue 5, May 2013 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com An Emerging Approach

More information

CHAPTER 4 OBJECT ORIENTED COMPLEXITY METRICS MODEL

CHAPTER 4 OBJECT ORIENTED COMPLEXITY METRICS MODEL 64 CHAPTER 4 OBJECT ORIENTED COMPLEXITY METRICS MODEL 4.1 INTRODUCTION Customers measure the aspects of the final product to determine whether it meets the requirements and provides sufficient quality.

More information

Clone Tracker: Tracking Inconsistent Clone Changes in A Clone Group

Clone Tracker: Tracking Inconsistent Clone Changes in A Clone Group Clone Tracker: Tracking Inconsistent Clone Changes in A Clone Group MD. JUBAIR IBNA MOSTAFA BSSE 0614 A Thesis Submitted to the Bachelor of Science in Software Engineering Program Office of the Institute

More information

What Kinds of Refactorings are Co-occurred? An Analysis of Eclipse Usage Datasets

What Kinds of Refactorings are Co-occurred? An Analysis of Eclipse Usage Datasets 2014 6th International Workshop on Empirical Software Engineering in Practice What Kinds of Refactorings are Co-occurred? An Analysis of Eclipse Usage Datasets Tsubasa Saika 1, Eunjong Choi 1, Norihiro

More information

Ctcompare: Comparing Multiple Code Trees for Similarity

Ctcompare: Comparing Multiple Code Trees for Similarity Ctcompare: Comparing Multiple Code Trees for Similarity Warren Toomey School of IT, Bond University Using lexical analysis with techniques borrowed from DNA sequencing, multiple code trees can be quickly

More information

arxiv: v1 [cs.se] 25 Mar 2014

arxiv: v1 [cs.se] 25 Mar 2014 Do the Fix Ingredients Already Exist? An Empirical Inquiry into the Redundancy Assumptions of Program Repair Approaches Matias Martinez Westley Weimer Martin Monperrus University of Lille & INRIA, France

More information

Visual Analytics Tools for the Global Change Assessment Model. Michael Steptoe, Ross Maciejewski, & Robert Link Arizona State University

Visual Analytics Tools for the Global Change Assessment Model. Michael Steptoe, Ross Maciejewski, & Robert Link Arizona State University Visual Analytics Tools for the Global Change Assessment Model Michael Steptoe, Ross Maciejewski, & Robert Link Arizona State University GCAM Simulation When exploring the impact of various conditions or

More information

Master Thesis. Type-3 Code Clone Detection Using The Smith-Waterman Algorithm

Master Thesis. Type-3 Code Clone Detection Using The Smith-Waterman Algorithm Master Thesis Title Type-3 Code Clone Detection Using The Smith-Waterman Algorithm Supervisor Prof. Shinji KUSUMOTO by Hiroaki MURAKAMI February 5, 2013 Department of Computer Science Graduate School of

More information

A Novel Technique for Retrieving Source Code Duplication

A Novel Technique for Retrieving Source Code Duplication A Novel Technique for Retrieving Source Code Duplication Yoshihisa Udagawa Computer Science Department, Faculty of Engineering Tokyo Polytechnic University Atsugi-city, Kanagawa, Japan udagawa@cs.t-kougei.ac.jp

More information

2IMP25 Software Evolution. Code duplication. Alexander Serebrenik

2IMP25 Software Evolution. Code duplication. Alexander Serebrenik 2IMP25 Software Evolution Code duplication Alexander Serebrenik Assignments Assignment 1 Median 7, mean 6.87 My grades: 3-3-1-1-2-1-4 You ve done much better than me ;-) Clear, fair grading BUT tedious

More information

Identification of File and Directory Level Near-Miss Clones For Higher Level Cloning Sonam Gupta, Vishwachi

Identification of File and Directory Level Near-Miss Clones For Higher Level Cloning Sonam Gupta, Vishwachi International Journal of Engineering and Advanced Technology (IJEAT) ISSN: 2249 8958, Volume-3, Issue-8 Identification of File and Directory Level Near-Miss Clones For Higher Level Cloning Sonam Gupta,

More information

Data-Flow Analysis Foundations

Data-Flow Analysis Foundations CS 301 Spring 2016 Meetings April 11 Data-Flow Foundations Plan Source Program Lexical Syntax Semantic Intermediate Code Generation Machine- Independent Optimization Code Generation Target Program This

More information

Code Syntax-Comparison Algorithm based on Type-Redefinition-Preprocessing and Rehash Classification

Code Syntax-Comparison Algorithm based on Type-Redefinition-Preprocessing and Rehash Classification 320 JOURNAL OF MULTIMEDIA, VOL. 6, NO. 4, AUGUST 2011 Code Syntax-Comparison Algorithm based on Type-Redefinition-Preprocessing and Rehash Classification Baojiang Cui 1,2 1 Beijing University of Posts

More information

Challenges in Mining Whole Software Universe

Challenges in Mining Whole Software Universe Challenges in Mining Whole Software Universe Katsuro Inoue Osaka University Cover Ratio Analyzing Evolution of kern_malloc 1 1 2 3 4 13 15 19 24 36 58 62 65 1 Lites 1.0 Kernel Source 2 Archive - CMU Mach

More information

Analysis of Coding Patterns over Software Versions

Analysis of Coding Patterns over Software Versions Information and Media Technologies 0(): - (05) reprinted from: Computer Software (): 0- (05) Analysis of Coding Patterns over Software Versions Hironori Date, Takashi Ishio, Makoto Matsushita, Katsuro

More information

Refactoring Model of Legacy Software in Smart Grid based on Cloned Codes Detection

Refactoring Model of Legacy Software in Smart Grid based on Cloned Codes Detection www.ijcsi.org 296 Refactoring Model of Legacy Software in Smart Grid based on Cloned Codes Detection Fanqi Meng 1, Zhaoyang Qu 2 and Xiaoli Guo 3 1 School of Information Engineering, Northeast Dianli University,Jilin,

More information

Rochester Institute of Technology. Making personalized education scalable using Sequence Alignment Algorithm

Rochester Institute of Technology. Making personalized education scalable using Sequence Alignment Algorithm Rochester Institute of Technology Making personalized education scalable using Sequence Alignment Algorithm Submitted by: Lakhan Bhojwani Advisor: Dr. Carlos Rivero 1 1. Abstract There are many ways proposed

More information

Assertion with Aspect

Assertion with Aspect Assertion with Aspect Takashi Ishio, Toshihiro Kamiya, Shinji Kusumoto, Katsuro Inoue Graduate School of Engineering Science, PRESTO, Japan Science and Technology Agency Osaka University 1-3 Machikaneyama-cho,

More information

CS 6V Using Reverse Engineering Practices to Improve Systems-of-Systems Understanding. Tom Hill

CS 6V Using Reverse Engineering Practices to Improve Systems-of-Systems Understanding. Tom Hill CS 6V81-05 Using Reverse Engineering Practices to Improve Systems-of-Systems Understanding Tom Hill Department of Computer Science The University of Texas at Dallas November 11 th, 2011 Outline Research

More information

Outline. Problem statement

Outline. Problem statement Outline CS 6V81-05 Using Reverse Engineering Practices to Improve Systems-of-Systems Understanding Tom Hill Department of Computer Science The University of Texas at Dallas November 11 th, 2011 Research

More information

SHINOBI: A Real-Time Code Clone Detection Tool for Software Maintenance

SHINOBI: A Real-Time Code Clone Detection Tool for Software Maintenance : A Real-Time Code Clone Detection Tool for Software Maintenance Takanobu Yamashina Hidetake Uwano Kyohei Fushida Yasutaka Kamei Masataka Nagura Shinji Kawaguchi Hajimu Iida Nara Institute of Science and

More information

Abstract. We define an origin relationship as follows, based on [12].

Abstract. We define an origin relationship as follows, based on [12]. When Functions Change Their Names: Automatic Detection of Origin Relationships Sunghun Kim, Kai Pan, E. James Whitehead, Jr. Dept. of Computer Science University of California, Santa Cruz Santa Cruz, CA

More information

Software Quality Understanding by Analysis of Abundant Data (SQUAAD)

Software Quality Understanding by Analysis of Abundant Data (SQUAAD) Software Quality Understanding by Analysis of Abundant Data (SQUAAD) By Pooyan Behnamghader Advisor: Barry Boehm ARR 2018 March 13, 2018 1 Outline Motivation Software Quality Evolution Challenges SQUAAD

More information

Gapped Code Clone Detection with Lightweight Source Code Analysis

Gapped Code Clone Detection with Lightweight Source Code Analysis Gapped Code Clone Detection with Lightweight Source Code Analysis Hiroaki Murakami, Keisuke Hotta, Yoshiki Higo, Hiroshi Igaki, Shinji Kusumoto Graduate School of Information Science and Technology, Osaka

More information

COMPARISON AND EVALUATION ON METRICS

COMPARISON AND EVALUATION ON METRICS COMPARISON AND EVALUATION ON METRICS BASED APPROACH FOR DETECTING CODE CLONE D. Gayathri Devi 1 1 Department of Computer Science, Karpagam University, Coimbatore, Tamilnadu dgayadevi@gmail.com Abstract

More information

Sub-clones: Considering the Part Rather than the Whole

Sub-clones: Considering the Part Rather than the Whole Sub-clones: Considering the Part Rather than the Whole Robert Tairas 1 and Jeff Gray 2 1 Department of Computer and Information Sciences, University of Alabama at Birmingham, Birmingham, AL 2 Department

More information