A Comparison of Code Similarity Analyzers C. Ragkhitwetsagul, J. Krinke, D. Clark
1 A Comparison of Code Similarity Analyzers. C. Ragkhitwetsagul, J. Krinke, D. Clark. SCAM '16, EMSE (under review)
2 When source code is copied and modified, which code similarity detection techniques or tools get the most accurate results?
3 Bellon et al. (TSE 2007); Roy et al. (Sci. Comp. Prog. 2009); Hage et al. (CSERC 2010); Biegel et al. (MSR '11)
4 (1) The selected tools are limited to only a subset of clone or plagiarism detectors (and their parameters). (2) The results are based on different data sets.
5 30 tools
6 Pervasive Modifications. From: software plagiarism, clone evolution, refactoring.

/* ORIGINAL */
private static int partition (Comparable[] a, int lo, int hi) {
    int i = lo;
    int j = hi+1;
    Comparable v = a[lo];
    while (true) {
        while (less(a[++i], v)) { if (i == hi) break; }
        while (less(v, a[--j])) { if (j == lo) break; }
        if (i >= j) break;
        exch(a, i, j);
    }
    exch(a, lo, j);
    return j;
}

/* PERVASIVELY MODIFIED CODE */
private static int partition (int[] bob, int left, int right){
    int x = left;
    int y = right+1;
    for (;;) {
        while (less(bob[left],bob[--y]))
            if (y == left) break;
        while (less(bob[++x],bob[left]))
            if (x == right) break;
        if (x >= y) break;
        swap(bob, y, x);
    }
    swap(bob, y, left);
    return y;
}
8 Pervasively modified code to be used in the detection phase. Pipeline: original -> source obfuscator (ARTIFICE) -> compiler (javac) -> bytecode obfuscator (ProGuard) -> decompilers (Krakatau, Procyon) -> pervasively modified code.
Original files: BubbleSort.java, EightQueens.java, GuessWord.java, TowerOfHanoi.java, InfixConverter.java, Kapreka_Tran.java, MagicSquare.java, RailRoadCar.java, SLinkedList.java, SqrtAlgorithm.java
9 Boiler-Plate Code. Detection of SOurce COde Re-use (SOCO). Flores E., Rosso P., Moreno L., Villatoro-Tello E. (2014)
10 Parameter Settings. Image: Jonathan H. Ward (Wikipedia, CC BY-SA 3.0)
12 Similarity Report [figure: pairwise similarity matrix; rows and columns are file versions (excerpt): InfConv/orig, InfConv/artifice, InfConv/orig_no_krakatau, InfConv/orig_no_procyon, InfConv/orig_pg_krakatau, InfConv/orig_pg_procyon, InfConv/artifice_no_krakatau, InfConv/artifice_no_procyon, InfConv/artifice_pg_krakatau, InfConv/artifice_pg_procyon, Sqrt/orig, Sqrt/artifice, Square/artifice_pg_krakatau, Square/artifice_pg_procyon]
13 Similarity Threshold = 50 [figure: pairwise similarity matrix over the file versions InfConv/orig through Square/artifice_pg_procyon, as in the Similarity Report slide]
14 Best Threshold [plot: F-measure (y-axis, up to 1.00) against Threshold Value (T) (x-axis)]
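The threshold slides can be read as a sweep: report a pair as similar when its score reaches T, compute the F-measure against the ground truth, and keep the T that scores highest. The sketch below illustrates that idea only. The class and method names are hypothetical, and the 0 to 100 score range and the "score >= T" rule are assumptions that the slides do not state.

```java
/** Hypothetical helper: finds the similarity threshold with the best F1. */
final class ThresholdSweep {

    /** F1 when every pair scoring at or above the threshold is reported as similar. */
    static double f1At(double[] similarity, boolean[] trulySimilar, double threshold) {
        int tp = 0, fp = 0, fn = 0;
        for (int i = 0; i < similarity.length; i++) {
            boolean reported = similarity[i] >= threshold; // assumed decision rule
            if (reported && trulySimilar[i]) tp++;
            else if (reported) fp++;
            else if (trulySimilar[i]) fn++;
        }
        if (tp == 0) return 0.0; // no true positives: treat F1 as zero
        double precision = (double) tp / (tp + fp);
        double recall = (double) tp / (tp + fn);
        return 2 * precision * recall / (precision + recall);
    }

    /** Tries integer thresholds 0..100 and returns the lowest one with the highest F1. */
    static int bestThreshold(double[] similarity, boolean[] trulySimilar) {
        int best = 0;
        double bestF1 = -1.0;
        for (int t = 0; t <= 100; t++) {
            double f1 = f1At(similarity, trulySimilar, t);
            if (f1 > bestF1) {
                bestF1 = f1;
                best = t;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up scores and labels, for illustration only.
        double[] similarity = {90, 60, 55, 20};
        boolean[] trulySimilar = {true, true, false, false};
        int t = bestThreshold(similarity, trulySimilar);
        System.out.println("best T = " + t + ", F1 = " + f1At(similarity, trulySimilar, t));
    }
}
```

With a fixed T = 50 the made-up data above yields one false positive; the sweep moves T just past that pair's score.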
15 Optimal Configuration: Best Param Settings, Best Threshold. Pervasive: 14,880,000 pairwise comparisons. SOCO: 99,816,528 pairwise comparisons.
16 Pervasive Mod.: F1 per tool [bar chart]
Clone det.: ccfx, deckard, iclones, nicad, simian
Plag. det.: jplag-java, jplag-text, plaggie, sherlock, simjava, simtext
Comp.: 7zncd-BZip2, 7zncd-Deflate, 7zncd-Deflate2, 7zncd-LZMA, 7zncd-Deflate64, 7zncd-PPMd, bzip2ncd, gzipncd, icd, ncd-bzlib, ncd-zlib, xz-ncd
Others: bsdiff, diff, difflib, fuzzywuzzy, jellyfish, ngram, cosine
17 Boiler-Plate: F1 per tool [bar chart]
Clone det.: ccfx, deckard, iclones, nicad, simian
Plag. det.: jplag-java, jplag-text, plaggie, sherlock, simjava, simtext
Comp.: 7zncd-BZip2, 7zncd-Deflate, 7zncd-Deflate2, 7zncd-LZMA, 7zncd-Deflate64, 7zncd-PPMd, bzip2ncd, gzipncd, icd, ncd-bzlib, ncd-zlib, xz-ncd
Others: bsdiff, diff, difflib, fuzzywuzzy, jellyfish, ngram, cosine
18 Highly specialised source code similarity detection techniques and tools can perform better than more general compression and textual similarity measures. Interesting: difflib and fuzzywuzzy.
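The compression-based measures in the tool list (the *ncd entries) rest on the Normalised Compression Distance, NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed size. A minimal sketch follows, assuming DEFLATE from java.util.zip as the compressor. The class name is hypothetical, and this is not the implementation of any tool in the comparison.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

/** Hypothetical sketch of Normalised Compression Distance using DEFLATE. */
final class NcdSketch {

    /** Size in bytes of the DEFLATE-compressed input. */
    static int compressedSize(byte[] data) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(data);
        deflater.finish();
        byte[] buffer = new byte[4096];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buffer); // only the count matters, not the bytes
        }
        deflater.end();
        return total;
    }

    /** NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)); lower means more similar. */
    static double ncd(String x, String y) {
        byte[] bx = x.getBytes(StandardCharsets.UTF_8);
        byte[] by = y.getBytes(StandardCharsets.UTF_8);
        byte[] bxy = new byte[bx.length + by.length];
        System.arraycopy(bx, 0, bxy, 0, bx.length);
        System.arraycopy(by, 0, bxy, bx.length, by.length);
        int cx = compressedSize(bx);
        int cy = compressedSize(by);
        int cxy = compressedSize(bxy);
        return (double) (cxy - Math.min(cx, cy)) / Math.max(cx, cy);
    }

    public static void main(String[] args) {
        String code = "int i = lo; int j = hi + 1; while (true) { if (i >= j) break; exch(a, i, j); } return j;";
        String prose = "The quick brown fox jumps over the lazy dog near the riverbank at dawn.";
        System.out.println("NCD(code, code)  = " + ncd(code, code));
        System.out.println("NCD(code, prose) = " + ncd(code, prose));
    }
}
```

Because NCD is a distance, a similarity score would be derived from it (for example 1 - NCD) before applying a threshold; that conversion is also an assumption here.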
19 Optimal Configurations: CCFX's Precision vs. Recall [table with columns Measure, Value, ccfx's params (b, t); rows Precision (surviving entries: 8, 9) and Recall]
20 CCFX Optimal Config.
21 b = 5, t = 11, 12; b = 19, t = 7, 8, 9
22 Pervasive Mod. / Boiler-Plate
23 The optimal configurations derived from one data set have a detrimental impact on the similarity detection results for another data set.
24 Normalisation by Decompilation: pervasively modified code -> normalisation (compile, then decompile) -> normalised code
25 F1 per tool, original (Orig.) vs. decompiled (Dec.) [bar chart]
Clone det.: ccfx, deckard, iclones, nicad, simian
Plag. det.: jplag-java, jplag-text, plaggie, sherlock, simjava, simtext
Comp.: 7zncd-BZip2, 7zncd-Deflate, 7zncd-Deflate2, 7zncd-LZMA, 7zncd-LZMA2, 7zncd-PPMd, bzip2ncd, gzipncd, icd, ncd-bzlib, ncd-zlib, xz-ncd
Others: bsdiff, diff, py-difflib, py-fuzzywuzzy, py-jellyfish, py-ngram, py-sklearn
26 Compilation and decompilation can be used as an effective normalisation method that greatly improves similarity detection on Java source code (with statistical significance). IWSC '17
27 Ranked Results: Only Top k Results [two bar charts of Mean Average Precision (MAP), one for Pervasive Mod. and one for Boiler-Plate; tools in extraction order: ccfx, fuzzywuzzy, ncd-bzlib, bzip2ncd, simian, gzipncd, ncd-zlib, jplag-java, difflib, jplag-text; simjava, gzipncd, ncd-zlib, sherlock, jplag-text, 7zncd-PPMd, xzncd, 7zncd-Deflate64, 7zncd-Deflate, fuzzywuzzy]
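Mean Average Precision scores a tool by the ranked list it returns for each query file: precision is taken at every rank where a relevant file appears, averaged per query, then averaged over all queries. The sketch below follows the usual information-retrieval definition. The slides do not say which variant was used, and the class name, method names and sample file names are hypothetical.

```java
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of Mean Average Precision over ranked result lists. */
final class MeanAveragePrecision {

    /** Average precision of one ranked list against the set of relevant items. */
    static double averagePrecision(List<String> ranked, Set<String> relevant) {
        if (relevant.isEmpty()) return 0.0;
        int hits = 0;
        double sum = 0.0;
        for (int i = 0; i < ranked.size(); i++) {
            if (relevant.contains(ranked.get(i))) {
                hits++;
                sum += (double) hits / (i + 1); // precision at this rank
            }
        }
        return sum / relevant.size();
    }

    /** Mean of the per-query average precisions; both lists are indexed by query. */
    static double meanAveragePrecision(List<List<String>> rankings, List<Set<String>> relevantSets) {
        if (rankings.isEmpty()) return 0.0;
        double total = 0.0;
        for (int q = 0; q < rankings.size(); q++) {
            total += averagePrecision(rankings.get(q), relevantSets.get(q));
        }
        return total / rankings.size();
    }

    public static void main(String[] args) {
        // Made-up rankings, for illustration only.
        List<List<String>> rankings = List.of(
            List.of("InfConv/artifice", "Sqrt/orig", "InfConv/orig_pg_procyon"),
            List.of("InfConv/orig", "Sqrt/artifice"));
        List<Set<String>> relevant = List.of(
            Set.of("InfConv/artifice", "InfConv/orig_pg_procyon"),
            Set.of("Sqrt/artifice"));
        System.out.println("MAP = " + meanAveragePrecision(rankings, relevant));
    }
}
```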
28 Distribution of tools' F1 scores vs. pervasive mod. type
Legend. Original: O = original. Obfuscator: A = Artifice (source), Pg = ProGuard (bytecode). Decompiler: K = Krakatau, Pc = Procyon.
29 F1 Score per tool and modification type [table]
Columns: Tool; O; A; K; Pc; Pg K; Pg Pc; A K; A Pc; A Pg K; A Pg Pc
Legend. Original: O = original. Obfuscator: A = Artifice (source), Pg = ProGuard (bytecode). Decompiler: K = Krakatau, Pc = Procyon.
Tools: ccfx, deckard, iclones, nicad, simian, jplag-java, jplag-text, plaggie, sherlock, simjava, simtext, 7zncd-BZip2, 7zncd-Deflate, 7zncd-Deflate2, 7zncd-LZMA, 7zncd-LZMA2, 7zncd-PPMd, bzip2ncd, gzipncd, icd, ncd-zlib, ncd-bzlib, xzncd, bsdiff, diff, difflib, fuzzywuzzy, jellyfish, ngram, cosine
30 To Sum Up: A Comparison of Code Similarity Analyzers. Research Note: Website:
More information