Software Clone Detection. Kevin Tang Mar. 29, 2012
1 Software Clone Detection Kevin Tang Mar. 29, 2012
2 Software Clone Detection
- Introduction
- Reasons for Code Duplication
- Drawbacks of Code Duplication
- Clone Definitions in the Literature
- Detection Techniques and Tools
- Applications and Related Research
3 Introduction
- Code cloning: reusing code fragments by copying and pasting
- Studies show that about 5% to 20% of a software system can consist of duplicated code
- What is difficult?
  - Detecting clones that have been modified
  - Differentiating the original from its copies
4 Reasons for Code Duplication (1): Development Strategy
- Reusing existing code by copying and pasting (with or without minor modification)
- Forking: reusing a similar solution in the expectation that the copies will diverge significantly
- Merging two similar systems
5 Reasons for Code Duplication (2): Maintenance Benefits
- Risk in developing new code
- Ensuring robustness in life-critical systems
- High cost of function calls in real-time programs
6 Reasons for Code Duplication (3): Overcoming Underlying Limitations
- Writing reusable code is error-prone
- Difficulty in understanding large systems
- Lack of ownership of the code to be reused
- Wrong method of measuring developers' productivity
7 Reasons for Code Duplication (4): Cloning by Accident
- Different developers coincidentally implementing the same logic
8 Drawbacks of Code Duplication
- Increased probability of bug propagation
- Increased probability of introducing a new bug
- Increased probability of bad design
- Increased difficulty in system maintenance and improvement
9 Clone Definitions in the Literature: Code Clone Types
- Textual similarity: clones are classified by how closely the text of two fragments matches
  - Type I
  - Type II
  - Type III
- Functional similarity: the two code fragments have identical or similar functionality
  - Type IV
10 Clone Definitions in the Literature: Code Clone Types
Type I: Identical code fragments except for variations in whitespace (possibly also layout) and comments.

    if (a >= b) {
        c = d + b; // Comment1
        d = d + 1;
    }else{
        c = d - a; //Comment2
    }

    if (a>=b) {
        // Comment1'
        c=d+b;
        d=d+1;
    }else{ // Comment2'
        c=d-a;
    }
11 Clone Definitions in the Literature: Code Clone Types
Type II: Structurally/syntactically identical fragments except for variations in identifiers, literals, types, layout and comments.

    if (a >= b) {
        c = d + b; // Comment1
        d = d + 1;
    }else{
        c = d - a; //Comment2
    }

    if (m >= n) { // Comment1'
        y = x + n;
        x = x + 5; //Comment3
    }else{
        y = x - m; //Comment2
    }
12 Clone Definitions in the Literature: Code Clone Types
Type III: Copied fragments with further modifications. Statements can be changed, added or removed, in addition to variations in identifiers, literals, types, layout and comments.

    if (a >= b) {
        c = d + b; // Comment1
        d = d + 1;
    }else{
        c = d - a; //Comment2
    }

    if (a >= b) {
        c = d + b; //Comment1
        e = 1; //new statement
        d = d + 1;
    }else{
        c = d - a; //Comment2
    }
13 Clone Definitions in the Literature: Code Clone Types
Type IV: Two or more code fragments that perform the same computation but are implemented through different syntactic variants.

    int i, j=1;
    for (i=1; i<=n; i++){
        j=j*i;
    }

    int factorial(int n){
        if (n == 0){
            return 1;
        }else{
            return n * factorial(n-1);
        }
    }
14 Detection Techniques and Tools: Text-based Techniques (1)
- The target source program is treated as a sequence of lines/strings
- Code fragments are compared with each other to find sequences of identical text/strings
- Little or no transformation/normalization is performed on the source code before the comparison starts; in most cases the raw source code is used directly
15 Detection Techniques and Tools: Text-based Techniques (2)
Problems
- Line breaks
- Identifier changes
- Parentheses/braces removed or added around a single statement

Fragment 1:

    dwframegrouplength = 1;
    for (dwcnt=2; dwcnt<=64; dwcnt*=2)
    {
        if ((( uloutrate/dwcnt) * dwcnt)!=
            uloutrate)
        {
            dwframegrouplength *=2;
        }
    }

Fragment 2:

    framegrouplength = 1;
    for (Cnt = 2; Cnt <= 64; Cnt *=2){
        if (( ( Rate / Cnt) * Cnt)!= rate)
            framegrouplength *=2;
    }
16 Detection Techniques and Tools: Text-based Techniques (3)
Improvements: the following filters and/or transformations/normalizations are applied before comparison
- Comment removal: ignores all kinds of comments in the source code, depending on the language of interest
- Whitespace removal: removes tabs, newlines and other blank spaces
- Normalization: some basic normalization can be applied to the source code
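The two filters above can be sketched in a few lines of Python. This is a minimal illustration, assuming C-style comment syntax and reusing the Type I and Type II fragments from the clone-definition slides; the helper names strip_comments and normalize_text are hypothetical and do not come from any tool named in these slides.

```python
import re

def strip_comments(code: str) -> str:
    """Remove /* ... */ and // ... comments.
    Assumption: comment markers never occur inside string literals."""
    code = re.sub(r"/\*.*?\*/", "", code, flags=re.DOTALL)
    return re.sub(r"//[^\n]*", "", code)

def normalize_text(code: str) -> str:
    """Drop comments, then drop every whitespace character."""
    return re.sub(r"\s+", "", strip_comments(code))

fragment_a = """
if (a >= b) {
    c = d + b; // Comment1
    d = d + 1;
}else{
    c = d - a; //Comment2
}
"""
fragment_b = """
if (a>=b) {
    // Comment1'
    c=d+b;
    d=d+1;
}else{ // Comment2'
    c=d-a;
}
"""
fragment_c = """
if (m >= n) { // Comment1'
    y = x + n;
    x = x + 5; //Comment3
}else{
    y = x - m; //Comment2
}
"""

# Type I pair: identical once comments and whitespace are gone.
print(normalize_text(fragment_a) == normalize_text(fragment_b))  # True
# Type II pair: renamed identifiers still defeat a purely textual comparison.
print(normalize_text(fragment_a) == normalize_text(fragment_c))  # False
```

Removing all whitespace also glues tokens such as `int x` together, so a real text-based detector would more likely collapse runs of whitespace than delete them outright.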
17 Detection Techniques and Tools: Text-based Techniques (4)
Normalization operations on source code elements

Operation | Language element | Example | Replacement
1 | Literal string | abort |
2 | Literal character | y | .
3 | Literal integer | |
4 | Literal decimal | |
5 | Identifier | counter | p
6 | Basic numerical type | int, short, long, double | num
7 | Function name | main() | foo()

Example: int increment(int counter) => num foo(num p)
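A rough Python sketch of operations 5 to 7, reproducing the example int increment(int counter) => num foo(num p). The function name normalize, the KEYWORDS set and the rule "an identifier followed by an opening parenthesis is a function name" are simplifying assumptions made for this illustration only.

```python
import re

NUMERIC_TYPES = {"int", "short", "long", "double"}
# Assumption: a small keyword set that must be left untouched.
KEYWORDS = {"if", "else", "for", "while", "return"}

def normalize(code: str) -> str:
    """Replace numeric types with 'num', function names with 'foo'
    and every other identifier with 'p'."""
    def replace(match: re.Match) -> str:
        word = match.group(0)
        if word in KEYWORDS:
            return word
        if word in NUMERIC_TYPES:
            return "num"
        # Heuristic: an identifier directly followed by '(' names a function.
        if code[match.end():].lstrip().startswith("("):
            return "foo"
        return "p"
    return re.sub(r"[A-Za-z_]\w*", replace, code)

print(normalize("int increment(int counter)"))  # num foo(num p)
```

After this step two declarations that differ only in names and numeric types become textually identical, which is what lets a text-based comparison report Type II clones.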
18 Detection Techniques and Tools: Token-based Techniques (1)
- The entire source system is parsed/transformed into a sequence of tokens
- The sequence is scanned for duplicated subsequences of tokens
- CCFinder
  - Each line of source code is divided into tokens by a lexer, and the tokens are concatenated into a single token sequence
  - Each identifier related to types, variables and constants is replaced with a special token
  - A suffix-tree based sub-string matching algorithm is then used to find similar sub-sequences
  - A mapping back to the original source code is required to report the clone pairs
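The token-based pipeline can be illustrated with a small Python sketch. It is not CCFinder's implementation: the suffix-tree matcher is swapped for a simple quadratic dynamic-programming search for the longest common token run, and the token grammar, the KEYWORDS set and the names tokenize and longest_common_run are assumptions made for this example. Each token keeps its line number so that a match can be mapped back to the source.

```python
import re

KEYWORDS = {"if", "else", "for", "while", "return", "int"}
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+(?:\.\d+)?|>=|<=|==|!=|\S")

def tokenize(code):
    """Return (normalized_token, line_number) pairs.
    Assumption: '//' never appears inside a string literal."""
    tokens = []
    for line_no, line in enumerate(code.splitlines(), start=1):
        line = line.split("//", 1)[0]          # drop the line comment
        for tok in TOKEN_RE.findall(line):
            if tok in KEYWORDS:
                norm = tok
            elif tok[0].isalpha() or tok[0] == "_":
                norm = "ID"                    # special token for identifiers
            elif tok[0].isdigit():
                norm = "LIT"                   # special token for literals
            else:
                norm = tok
            tokens.append((norm, line_no))
    return tokens

def longest_common_run(a, b):
    """Longest common contiguous token run: (length, start_in_a, start_in_b)."""
    best = (0, 0, 0)
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1][0] == b[j - 1][0]:
                cur[j] = prev[j - 1] + 1
                if cur[j] > best[0]:
                    best = (cur[j], i - cur[j], j - cur[j])
        prev = cur
    return best

frag1 = """if (a >= b) {
    c = d + b; // Comment1
    d = d + 1;
}else{
    c = d - a; //Comment2
}"""
frag2 = """if (m >= n) { // Comment1'
    y = x + n;
    x = x + 5; //Comment3
}else{
    y = x - m; //Comment2
}"""

t1, t2 = tokenize(frag1), tokenize(frag2)
length, start1, start2 = longest_common_run(t1, t2)
first_line = t1[start1][1]
last_line = t1[start1 + length - 1][1]
print(f"matched {length} of {len(t1)} tokens, fragment 1 lines {first_line}-{last_line}")
```

Because identifiers and literals collapse to ID and LIT, the Type II pair matches over its whole length. The same normalization is also the source of the false positives that the comparison slide attributes to token-based tools.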
19 Detection Techniques and Tools: Token-based Techniques (2)
Other improvements
- Suppress insignificant token classes (e.g., access modifiers in Java) that may cause noise in detection (RTF)
- Assign the same ID to the different types int, short, long, float, double (RTF)
- Frequent subsequence mining, where a frequent subsequence can be interleaved in its supporting sequences (CP-Miner)
- Token-based techniques are also used in the area of plagiarism detection
20 Detection Techniques and Tools: Tree-based Techniques (1)
- Source code is parsed into a parse tree or an abstract syntax tree (AST)
- Similar subtrees are searched for with a tree matching technique
- The source code corresponding to the similar subtrees is returned
- CloneDR
  - Compares subtrees through tree matching, using characterization metrics based on a hash function
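A compact way to see the tree-based idea is to fingerprint every subtree and bucket subtrees with equal fingerprints. The sketch below uses Python's standard ast module, so it analyses Python source rather than the C-like fragments shown earlier; that substitution, the helper names shape, size and similar_subtrees, and the min_nodes threshold are all assumptions of this example. It reports exact structural matches only, whereas a tool such as CloneDR goes on to compare the subtrees that fall into the same bucket.

```python
import ast
from collections import defaultdict

def shape(node):
    """Structural fingerprint: node types only, names and constants ignored."""
    children = ",".join(shape(child) for child in ast.iter_child_nodes(node))
    return f"{type(node).__name__}({children})"

def size(node):
    """Number of AST nodes in the subtree rooted at node."""
    return sum(1 for _ in ast.walk(node))

def similar_subtrees(source, min_nodes=12):
    """Group statements with the same shape; return their starting lines."""
    buckets = defaultdict(list)
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.stmt) and size(node) >= min_nodes:
            buckets[shape(node)].append(node.lineno)
    return [sorted(lines) for lines in buckets.values() if len(lines) > 1]

source = """\
if a >= b:
    c = d + b
    d = d + 1
else:
    c = d - a

if m >= n:
    y = x + n
    x = x + 5
else:
    y = x - m
"""

print(similar_subtrees(source))  # [[1, 7]]
```

Lowering min_nodes also reports the small assignment statements inside the two branches, which shows why tree-based tools usually impose a minimum clone size.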
21 Detection Techniques and Tools: Tree-based Techniques (2)
- Evans and Fraser proposed structural abstraction
- Suppose the clone a[?] = x; occurs twice:

      a[i] = x;      // lexical: the hole ? is filled by a single token
      a[i+1] = x;    // structural: the hole ? is filled by a whole subtree
22 Detection Techniques and Tools: AST + Suffix Tree Techniques
- The AST nodes are serialized in preorder traversal
- A suffix tree is created for the serialized AST nodes
- The AST node sequences are cut according to their syntactic regions
- Clones are found on the transformed AST using a sequence matching algorithm
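The first two steps can be sketched as follows. The example is a simplification under stated assumptions: the AST is a hand-written tuple tree of the form (label, child, ...), a naive suffix array (sorted suffixes) stands in for the suffix tree, and the step that cuts matches at syntactic region boundaries is omitted, so a reported run is not guaranteed to cover complete subtrees. The names preorder and longest_repeat are hypothetical.

```python
def preorder(tree):
    """Serialize a (label, child, child, ...) tuple tree into a list of labels."""
    label, *children = tree
    out = [label]
    for child in children:
        out.extend(preorder(child))
    return out

def longest_repeat(seq):
    """Longest contiguous run occurring at least twice.
    Uses sorted suffixes: the answer is the longest common prefix
    of two neighbouring suffixes in sorted order."""
    suffixes = sorted(range(len(seq)), key=lambda i: seq[i:])
    best = []
    for x, y in zip(suffixes, suffixes[1:]):
        n = 0
        while x + n < len(seq) and y + n < len(seq) and seq[x + n] == seq[y + n]:
            n += 1
        if n > len(best):
            best = seq[x:x + n]
    return best

# Hypothetical AST of three statements; the first and third have the same shape.
add_stmt = ("Assign", ("Id",), ("Add", ("Id",), ("Id",)))
sub_stmt = ("Assign", ("Id",), ("Sub", ("Id",), ("Id",)))
tree = ("Block", add_stmt, sub_stmt, add_stmt)

serialized = preorder(tree)
print(longest_repeat(serialized))  # ['Assign', 'Id', 'Add', 'Id', 'Id']
```

Sorting full suffixes costs far more than building a suffix tree, so this only illustrates the matching idea, not the scalability that motivates the technique.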
23 Detection Techniques and Tools: PDG-based Techniques
- A Program Dependency Graph (PDG) contains the control flow and data flow information of a program
- An isomorphic subgraph matching algorithm is applied to find similar subgraphs
- PDG-DUP
  - Finds isomorphic PDG subgraphs using program slicing
  - Groups identified clones together while preserving the semantics of the original code, for automatic procedure extraction to support software refactoring
- PDG-based approaches are robust to reordered statements and to insertion and deletion of code, but they do not scale to large programs
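The robustness to reordered statements can be seen on a toy example. The Python sketch below builds only the data-dependence edges of straight-line assignments and compares the resulting edge sets; it has no control dependences, no slicing and no subgraph isomorphism search, so it is far simpler than PDG-DUP. The function data_dependences and the statement format are assumptions made for this illustration.

```python
import re

def data_dependences(statements):
    """Edges (defining statement, using statement) for straight-line
    assignments of the form 'target = expression'."""
    last_def = {}          # variable -> statement that last assigned it
    edges = set()
    for stmt in statements:
        target, expr = [part.strip() for part in stmt.split("=", 1)]
        for var in re.findall(r"[A-Za-z_]\w*", expr):
            if var in last_def:
                edges.add((last_def[var], stmt))
        last_def[target] = stmt
    return edges

original = ["x = a + 1", "y = b * 2", "z = x + y"]
reordered = ["y = b * 2", "x = a + 1", "z = x + y"]

# As text or token sequences the two fragments differ ...
print(original == reordered)                                      # False
# ... but their dependence graphs are the same.
print(data_dependences(original) == data_dependences(reordered))  # True
```

The first two statements do not depend on each other, so swapping them leaves the graph unchanged, which is exactly the kind of edit that text-based and token-based comparison cannot see through.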
24 Detection Techniques and Tools: Comparison of the detection approaches

Approach | Portability | Precision | Recall | Scalability
Text-based | High, needs a lexer at most | 100%, no false positives as it checks for exact copies | Low, only finds exact copies | Depends on the comparison algorithm
Token-based | Medium, needs a lexer and transformation rules | Low, normalization and/or transformation returns many false positives | High, can detect most clones | High with a suffix-tree algorithm
Tree-based | Low, needs a parser | High, the parse tree also considers structural information | Low, cannot detect all types of clones | Depends on how the comparison is made
PDG-based | Low, needs a PDG generator | High, considers structural and semantic information too | Medium, cannot detect all clones | Low, graph matching is costly
25 Detection Techniques and Tools: Clone Detection Tools

Tool | Supported Language | Approach | Background
Dup | C, C++, Java | Line-based/text-based | Academic
JPlag | C, C++, Java | Token/Greedy String | Academic
CloneDR | C, C++, Java, COBOL | AST/Tree matching | Commercial
DupLoc | Language independent | Line/Exact string matching | Academic
PDG-DUP | C, C++ | PDG/Slicing | Academic
26 Applications and Related Research for Clone Detection
- Plagiarism detection
- Origin analysis and software evolution
- Multi-version program analysis
- Bug detection
- Malicious software detection
27 Question
- Does the else-block contain clones of the if-block?
- If it does, what is the type of the clone?
- What detection technique can be applied to detect these clones?

    if (a > 10)
        i = i + 1;
        d = a * sin(b);
        M(i) = d;
    else
        if (a > 5)
            j = j+2; %increment
            d = b * sin(b);
            M(i)=d;
        else
            j = j + 1;
            d = a * sin(b + 1);
            M(i) = d;
        end
        k = k + 1;
    end
28 Question
Given a MATLAB source code file containing exactly one outer if-statement, write a program that prints all Type II clones of the if-block found in the else-block. You may assume there are no empty statements or lines.

Hint:

    ifblock = ifstmtnode.getifblock(0).getchild(1);
    elseblock = ifstmtnode.getelseblock().getchild(0);

    if (a > 10)
        i = i + 1;
        d = a * sin(b);
        M(i) = d;
    else
        if (a > 5)
            j = j+2; %increment
            d = b * sin(b);
            M(i)=d;
        else
            j = j + 1;
            d = a * sin(b + 1);
            M(i) = d;
        end
        k = k + 1;
    end
29 References
- Chanchal Kumar Roy, James R. Cordy. A Survey on Software Clone Detection Research.
- Magdalena Balazinska, Ettore Merlo, Michel Dagenais, Bruno Lagüe, Kostas Kontogiannis. Advanced Clone-Analysis to Support Object-Oriented System Refactoring.
More informationInternational Journal of Computer Science Trends and Technology (IJCST) Volume 3 Issue 2, Mar-Apr 2015
RESEARCH ARTICLE Code Clone Detection and Analysis Using Software Metrics and Neural Network-A Literature Review Balwinder Kumar [1], Dr. Satwinder Singh [2] Department of Computer Science Engineering
More informationIntroduction to Lexical Analysis
Introduction to Lexical Analysis Outline Informal sketch of lexical analysis Identifies tokens in input string Issues in lexical analysis Lookahead Ambiguities Specifying lexical analyzers (lexers) Regular
More informationLexical Analysis. COMP 524, Spring 2014 Bryan Ward
Lexical Analysis COMP 524, Spring 2014 Bryan Ward Based in part on slides and notes by J. Erickson, S. Krishnan, B. Brandenburg, S. Olivier, A. Block and others The Big Picture Character Stream Scanner
More informationCSCI312 Principles of Programming Languages!
CSCI312 Principles of Programming Languages!! Chapter 3 Regular Expression and Lexer Xu Liu Recap! Copyright 2006 The McGraw-Hill Companies, Inc. Clite: Lexical Syntax! Input: a stream of characters from
More informationGrammars and Parsing. Paul Klint. Grammars and Parsing
Paul Klint Grammars and Languages are one of the most established areas of Natural Language Processing and Computer Science 2 N. Chomsky, Aspects of the theory of syntax, 1965 3 A Language...... is a (possibly
More informationPart VII. Querying XML The XQuery Data Model. Marc H. Scholl (DBIS, Uni KN) XML and Databases Winter 2005/06 153
Part VII Querying XML The XQuery Data Model Marc H. Scholl (DBIS, Uni KN) XML and Databases Winter 2005/06 153 Outline of this part 1 Querying XML Documents Overview 2 The XQuery Data Model The XQuery
More informationCode Clone Analysis and Application
Code Clone Analysis and Application Katsuro Inoue Osaka University Talk Structure Clone Detection CCFinder and Associate Tools Applications Summary of Code Clone Analysis and Application Clone Detection
More informationfor (i=1; i<=100000; i++) { x = sqrt (y); // square root function cout << x+i << endl; }
Ex: The difference between Compiler and Interpreter The interpreter actually carries out the computations specified in the source program. In other words, the output of a compiler is a program, whereas
More informationMP 3 A Lexer for MiniJava
MP 3 A Lexer for MiniJava CS 421 Spring 2012 Revision 1.0 Assigned Wednesday, February 1, 2012 Due Tuesday, February 7, at 09:30 Extension 48 hours (penalty 20% of total points possible) Total points 43
More informationOn the effectiveness of clone detection by string matching
Research On the effectiveness of clone detection by string matching Stéphane Ducasse, Oscar Nierstrasz and Matthias Rieger Software Composition Group, Institute for Applied Mathematics and Computer Science,
More informationIncremental Clone Detection and Elimination for Erlang Programs
Incremental Clone Detection and Elimination for Erlang Programs Huiqing Li and Simon Thompson School of Computing, University of Kent, UK {H.Li, S.J.Thompson}@kent.ac.uk Abstract. A well-known bad code
More information좋은 발표란 무엇인가? 정영범 서울대학교 5th ROSAEC Workshop 2011년 1월 6일 목요일
5th ROSAEC Workshop ! " # $ Static Analysis of Multi-Staged Programs via Unstaging Translation Wontae Choi Baris Aktemur Kwangkeun Yi Seoul National University, Korea UIUC, USA & Ozyegin University,
More informationSyntax and Grammars 1 / 21
Syntax and Grammars 1 / 21 Outline What is a language? Abstract syntax and grammars Abstract syntax vs. concrete syntax Encoding grammars as Haskell data types What is a language? 2 / 21 What is a language?
More informationfor (i=1; i<=100000; i++) { x = sqrt (y); // square root function cout << x+i << endl; }
Ex: The difference between Compiler and Interpreter The interpreter actually carries out the computations specified in the source program. In other words, the output of a compiler is a program, whereas
More informationParsing and Pattern Recognition
Topics in IT 1 Parsing and Pattern Recognition Week 10 Lexical analysis College of Information Science and Engineering Ritsumeikan University 1 this week mid-term evaluation review lexical analysis its
More informationCS 6353 Compiler Construction Project Assignments
CS 6353 Compiler Construction Project Assignments In this project, you need to implement a compiler for a language defined in this handout. The programming language you need to use is C or C++ (and the
More informationScalable Code Clone Detection and Search based on Adaptive Prefix Filtering
Scalable Code Clone Detection and Search based on Adaptive Prefix Filtering Manziba Akanda Nishi a, Kostadin Damevski a a Department of Computer Science, Virginia Commonwealth University Abstract Code
More information