A Heuristic-based Approach to Identify Concepts in Execution Traces

1 A Heuristic-based Approach to Identify Concepts in Execution Traces
Fatemeh Asadi *, Massimiliano Di Penta **, Giuliano Antoniol *, Yann-Gaël Guéhéneuc **
* Ecole Polytechnique de Montréal, Canada
** Dept. of Engineering, Univ. of Sannio, Italy
CSMR Madrid (Spain)

2 Motivations
- Software systems lack adequate documentation
- Developers try to understand systems through
  - Static analyses, visualizations built upon static data
  - Dynamic analyses, requiring the execution of the system
- (Dynamic) concept identification
  - Identify sets of method calls in execution traces responsible for the implementation of domain concepts or user-observable features
- Existing approaches are based on static analysis [Anquetil and Lethbridge (1998)], dynamic analysis [Wilde and Scully (1995); Tonella and Ceccato (2004)], IR techniques [Poshyvanyk et al. (2007)], or hybrid ones [Eaddy et al. (2008)]

3 Proposed approach
- A novel approach that analyzes execution traces and groups together method calls that are:
  (i) invoked together in sequence
  (ii) cohesive and decoupled from a conceptual point of view
- Assumptions
  - Consider a feature being executed in a scenario, e.g., Open a Web page from a browser or Save an image in a paint application
  - The set of methods related to the feature is likely to be:
    (i) conceptually cohesive
    (ii) decoupled from those of other features
    (iii) sequentially invoked

4 Proposed approach
- Step I: System instrumentation
- Step II: Execution trace collection
- Step III: Trace pruning and compression
- Step IV: Textual analysis of method source code
- Step V: Search-based concept identification

5 Step I and Step II: Getting Traces
- Step I: System instrumentation
  - System instrumented using the MoDeC instrumentor
  - MoDeC: a tool to extract and model sequence diagrams for Java systems
  - Java bytecode instrumentation tool
    - Inserts dedicated method invocations into the system at method/constructor entry and exit points
    - Allows for trace tagging
- Step II: Execution trace collection
  - We exercise a system following operation sequences taken from user manuals or use case descriptions

6 Step III: Trace Pruning and Compression
- Pruning: removing methods not very useful for feature identification
  - Methods occurring in many scenarios
    - Are often utility methods
    - We use the same idea as tf-idf in Information Retrieval
  - Too frequent methods
    - Could, for example, be related to crosscutting concerns
    - We remove methods having a frequency greater than Q3 + 2 × IQR (75th percentile plus twice the interquartile range)
- Trace compression
  - Aim: collapse repetitions in execution traces
  - Purpose: reduce the search space for Step V
  - Examples: m1(); m1(); m1(); m1(); m2(); m1(); m2(); m1(); m1(); m2();
  - Performed using Run Length Encoding (RLE)
  - Applied to sub-sequences of arbitrary length
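The two operations of Step III can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the quartiles are computed with Python's default ("exclusive") method, which may differ from the definition used in the study, and the compression is a simple greedy collapse of immediately repeated sub-sequences.

```python
from collections import Counter
import statistics


def frequency_threshold(counts):
    """Return Q3 + 2 * IQR for a list of per-method call counts."""
    q1, _median, q3 = statistics.quantiles(counts, n=4)
    return q3 + 2 * (q3 - q1)


def prune_frequent(trace):
    """Drop every call to a method whose frequency exceeds the threshold."""
    counts = Counter(trace)
    limit = frequency_threshold(list(counts.values()))
    return [method for method in trace if counts[method] <= limit]


def collapse_repetitions(trace, max_len=None):
    """Collapse immediately repeated sub-sequences (a b a b -> a b).

    Sub-sequence lengths are tried from 1 up to max_len (assumed default:
    half the trace length), so repetitions of arbitrary length are handled.
    """
    result = list(trace)
    max_len = max_len or len(result) // 2
    for size in range(1, max_len + 1):
        out, i = [], 0
        while i < len(result):
            block = result[i:i + size]
            if len(block) == size and result[i + size:i + 2 * size] == block:
                out.extend(block)          # keep one copy of the block
                i += size
                while result[i:i + size] == block:
                    i += size              # skip the repeated copies
            else:
                out.append(result[i])
                i += 1
        result = out
    return result
```

For instance, `collapse_repetitions(["m1", "m2", "m1", "m2"])` keeps a single `m1, m2` pair, while `prune_frequent` only removes a method when its call count is an outlier relative to the other methods in the trace.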

7 Step IV
- Conceptual cohesion and coupling determined according to [Marcus et al., 2008] and [Poshyvanyk et al., 2006]
- Index the identifiers and comments contained in methods
  - Extraction of identifiers and comment words
  - Camel-case splitting of compound identifiers
  - Stop word removal (English + Java keywords)
  - Stemming using the Porter stemmer
  - Indexing using tf-idf
- Reduce the term-document space into a (smaller) concept-document space using Latent Semantic Indexing (LSI)
  - Helps to cope with synonymy and homonymy
  - Concept space = 50
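The first part of this indexing pipeline can be illustrated with a small sketch. It covers identifier splitting, stop word removal, and tf-idf weighting only; Porter stemming and the LSI reduction to 50 concepts are deliberately left out. All names, the tiny stop word list, and the particular tf-idf variant (relative term frequency times log inverse document frequency) are assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop list (assumption); a real one would cover
# English stop words and all Java keywords.
STOP_WORDS = {"the", "a", "of", "to", "public", "void", "return", "new", "int"}


def split_identifier(identifier):
    """Split a camel-case identifier into lowercase words."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", identifier)
    return [part.lower() for part in parts]


def method_terms(source_text):
    """Extract identifier and comment words from a method's source text."""
    words = re.findall(r"[A-Za-z][A-Za-z0-9]*", source_text)
    terms = [term for word in words for term in split_identifier(word)]
    return [term for term in terms if term not in STOP_WORDS]


def tfidf_vectors(docs):
    """Map {method: [terms]} to {method: {term: tf-idf weight}}."""
    n = len(docs)
    df = Counter(term for terms in docs.values() for term in set(terms))
    vectors = {}
    for name, terms in docs.items():
        tf = Counter(terms)
        vectors[name] = {
            term: (count / len(terms)) * math.log(n / df[term])
            for term, count in tf.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(
        sum(w * w for w in v.values())
    )
    return dot / norm if norm else 0.0
```

In this sketch each method is one document, so the cosine between two method vectors plays the role of the textual similarity used later for cohesion and coupling.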

8 Step V
- We use a search-based optimization technique based on Genetic Algorithms (GA) to split traces into segments
- Representation: a bit-vector where 1 indicates the end of a segment
- [Figure: trace splitting representation over the calls m1 m2 m1 m3 m4 m1 m4 m6 m]
- Mutation: randomly flips a bit (i.e., splits or merges segments)
- Crossover: two-point
- Selection: roulette wheel
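The representation and the three operators named on this slide can be sketched as below. This is a generic illustration of bit-flip mutation, two-point crossover, and roulette wheel selection, not the authors' distributed implementation; the function names are hypothetical, and the selection assumes non-negative fitness values with a positive sum.

```python
import random


def segments(trace, bits):
    """Split a trace wherever the bit-vector holds 1 (end of a segment)."""
    result, current = [], []
    for call, bit in zip(trace, bits):
        current.append(call)
        if bit == 1:
            result.append(current)
            current = []
    if current:                      # the trace end always closes a segment
        result.append(current)
    return result


def mutate(bits, rng):
    """Flip one randomly chosen bit, which splits or merges segments."""
    child = list(bits)
    position = rng.randrange(len(child))
    child[position] ^= 1
    return child


def two_point_crossover(a, b, rng):
    """Swap the slice between two random cut points of the parents."""
    i, j = sorted(rng.sample(range(len(a) + 1), 2))
    return a[:i] + b[i:j] + a[j:], b[:i] + a[i:j] + b[j:]


def roulette_select(population, fitnesses, rng):
    """Pick one individual with probability proportional to its fitness."""
    return rng.choices(population, weights=fitnesses, k=1)[0]
```

For example, `segments(["m1", "m2", "m1", "m3"], [0, 1, 0, 0])` yields the two segments `[m1, m2]` and `[m1, m3]`.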

9 Step V: Quality of the Solution
- Fitness function:
  - Segment Cohesion is the average (textual) similarity between any pair of methods in a segment
  - Segment Coupling is the average (textual) similarity between a segment and all other segments in the trace
- Other GA parameters
  - 200 individuals
  - 2,000 generations for JHotDraw and 3,000 for ArgoUML
  - 5% mutation probability, 70% crossover probability
  - Distributed GA implementation (across 4 servers)
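The two measures can be sketched as follows. The similarity function `sim` is a parameter (any textual similarity between two methods), the function names are hypothetical, and two choices are assumptions rather than facts from the slides: a single-method segment gets cohesion 0, and coupling is averaged over all method pairs that cross the segment boundary. How the study combines the two measures into one fitness value is not shown in this transcription, so the sketch stops at the two measures.

```python
from itertools import combinations


def segment_cohesion(segment, sim):
    """Average similarity over all pairs of methods inside one segment."""
    pairs = list(combinations(segment, 2))
    if not pairs:
        return 0.0                   # assumption: one-method segment
    return sum(sim(a, b) for a, b in pairs) / len(pairs)


def segment_coupling(segment, other_segments, sim):
    """Average similarity between a segment's methods and all methods
    of the other segments in the trace."""
    pairs = [(a, b) for a in segment for other in other_segments for b in other]
    if not pairs:
        return 0.0                   # assumption: no other segments
    return sum(sim(a, b) for a, b in pairs) / len(pairs)
```

A segmentation is good when cohesion is high and coupling is low, which is what the search tries to achieve when moving segment boundaries.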

10 Empirical Study
- Goal: analyze the novel concept location approach
- Purpose: evaluate its capability of identifying meaningful concepts
- Quality focus: accuracy and completeness of the identified concepts
- Context: an implementation of our approach and execution traces extracted from two open source systems, JHotDraw and ArgoUML

11 Research Questions
- RQ1: How stable is the GA, across multiple runs, when identifying concepts in execution traces?
- RQ2: To what extent do the identified concepts match the ones in the oracle?
- RQ3: How accurate is the identification of concepts in execution traces?

12 RQ1: GA stability
- We compute the overlap between segmentations obtained in multiple runs using the Jaccard overlap score
- Two segments overlap when they contain calls in the same positions of the trace
- Because a segment of trace T1 may overlap with several segments of T2, the highest similarity is chosen
- [Figure: the trace m1 m2 m1 m3 m4 m1 m4 m6 m1 segmented in Run 1 and in Run 2; segment overlaps 2/3, 2/4, 3/4]
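The overlap computation can be sketched as below. Segments are represented as sets of trace positions, and each segment of one run is scored against its best-matching segment of the other run. The function names are hypothetical, and the boundaries in the usage note are also hypothetical: the figure's actual boundaries are not recoverable from the transcription, so they were chosen only so that the scores match the 2/3, 2/4, 3/4 shown on the slide.

```python
from fractions import Fraction


def segment_positions(bits):
    """Turn a boundary bit-vector into a list of sets of trace positions."""
    result, current = [], set()
    for position, bit in enumerate(bits):
        current.add(position)
        if bit == 1:
            result.append(current)
            current = set()
    if current:
        result.append(current)
    return result


def jaccard(a, b):
    """Jaccard overlap |a & b| / |a | b| of two sets of positions."""
    return Fraction(len(a & b), len(a | b))


def best_overlaps(run1, run2):
    """For each segment of run1, keep its highest overlap with run2."""
    return [max(jaccard(s, t) for t in run2) for s in run1]
```

With nine calls and the assumed boundaries `[0,1,0,0,1,0,0,0,1]` for Run 1 and `[0,0,1,0,0,1,0,0,1]` for Run 2, `best_overlaps` returns 2/3, 1/2 (that is, 2/4), and 3/4.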

13 RQ1: Results
- Average overlap between 72% and 84%
- Slightly higher convergence for ArgoUML
- The algorithm is able to converge, despite the relatively large search space

14 RQ2: Matching with the Oracle
- We manually tag the start and end of features while executing the system
  - Using the MoDeC instrumentation tool
  - While executing the instrumented system, the user triggers the insertion of <Start> and <Stop> tags in the trace
- Matching between identified traces and the oracle is computed as in RQ1
- [Figure: the trace m1 m2 m1 m3 m4 m1 m4 m6 m1 segmented in Run 1 and in the Oracle; segment overlaps 2/3, 2/4, 3/4]

15 RQ2: Results
- High overlap for some features, e.g., Draw rectangle or Draw circle
- Lower for features obtained by adapting other ones, e.g., Add text, obtained by adapting Draw rectangle
- In other cases, low overlap is due to large segments being split into smaller, more cohesive ones

16 RQ3: Accuracy in trace identification
- Computed similarly to RQ2, but using precision instead of the Jaccard overlap score
- [Figure: the trace m1 m2 m1 m3 m4 m1 m4 m6 m1 segmented in Run 1 and in the Oracle; segment precisions 2/2, 2/3, 3/4]
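Precision differs from the Jaccard score only in its denominator: the shared positions are divided by the size of the identified segment rather than by the size of the union. A minimal sketch follows, with hypothetical function names and hypothetical segment boundaries chosen so that the results match the 2/2, 2/3, 3/4 shown on the slide; scoring each identified segment against its best-matching oracle segment is also an assumption.

```python
from fractions import Fraction


def precision(identified, oracle):
    """Fraction of an identified segment's positions found in the oracle segment."""
    return Fraction(len(identified & oracle), len(identified))


def best_precisions(identified_segments, oracle_segments):
    """Score each identified segment against its best-matching oracle segment."""
    return [
        max(precision(segment, oracle) for oracle in oracle_segments)
        for segment in identified_segments
    ]


# Hypothetical segmentations of a nine-call trace, as sets of positions.
run1 = [{0, 1}, {2, 3, 4}, {5, 6, 7, 8}]
oracle = [{0, 1, 2}, {3, 4, 5}, {6, 7, 8}]
scores = best_precisions(run1, oracle)   # 2/2, 2/3, 3/4
```

A segment that lies entirely inside an oracle feature therefore reaches 100% precision even when it covers only part of that feature.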

17 RQ3: Results
- Precision often very high
  - In most cases above 85% and often equal to 100%
- Low precision (mean 32%) for Add text
- Relatively low (mean 69%) for Draw rectangle
  - These two features are difficult to distinguish

18 Inspection of the obtained segments
- Add class (ArgoUML)
  - The approach split this long feature, a sequence of 199 methods, into 5 segments related to sub-features (creation of objects, adding the project class, handling namespace, setting object properties, handling persistence of the diagram)
- Create note (ArgoUML)
  - Only the first part (50 methods) of the trace, composed of 88 calls, was identified
  - Problems related to multi-threading
  - Problems related to collapsing (during compression) loops containing variants
- Cut rectangle (JHotDraw)
  - Only the last 39 out of 172 calls were included in the segment
    - Methods related to adding to the clipboard and showing the rectangle as cut
  - The first methods are related to GUI events and were split into many small segments
- Spawn window (JHotDraw)
  - 72 out of 197 methods included
  - The remaining ones were related to setting up menu command properties

19 Threats to Validity
- Construct validity (relation between theory and observation)
  - Multi-threading can change the ordering of calls in multiple executions of the same scenario
  - A better assessment of the actual content of the obtained segments is needed
- Internal validity (presence of confounding factors)
  - Trace tagging may be imprecise, again due to multi-threading
  - Noise due to utility methods
  - GA intrinsic randomness
- External validity (generalization of findings)
  - We analyzed two different systems and multiple traces
  - As usual, further empirical evaluation is needed

20 Conclusions
- We proposed a search-based approach to automatically locate concepts in execution traces
  - By splitting traces into conceptually cohesive and decoupled segments
- An empirical study on traces from JHotDraw and ArgoUML shows that
  - The approach is stable
  - Identified segments are highly precise
  - Splitting is finer than the high-level features
- Limitations due to: multi-threading, GUI events, feature adaptation
- Work in progress:
  - Improve performance
  - Use enhanced compression techniques
  - Automatically label identified concepts
  - Perform an extensive empirical validation

21 Thank You! Questions?

Combining Probabilistic Ranking and Latent Semantic Indexing for Feature Identification

Combining Probabilistic Ranking and Latent Semantic Indexing for Feature Identification Combining Probabilistic Ranking and Latent Semantic Indexing for Feature Identification Denys Poshyvanyk, Yann-Gaël Guéhéneuc, Andrian Marcus, Giuliano Antoniol, Václav Rajlich 14 th IEEE International

More information

Parameterizing and Assembling IR-based Solutions for SE Tasks using Genetic Algorithms

Parameterizing and Assembling IR-based Solutions for SE Tasks using Genetic Algorithms Parameterizing and Assembling IR-based Solutions for SE Tasks using Genetic Algorithms Annibale Panichella 1, Bogdan Dit 2, Rocco Oliveto 3, Massimiliano Di Penta 4, Denys Poshyvanyk 5, Andrea De Lucia

More information

Java Archives Search Engine Using Byte Code as Information Source

Java Archives Search Engine Using Byte Code as Information Source Java Archives Search Engine Using Byte Code as Information Source Oscar Karnalim School of Electrical Engineering and Informatics Bandung Institute of Technology Bandung, Indonesia 23512012@std.stei.itb.ac.id

More information

Configuring Topic Models for Software Engineering Tasks in TraceLab

Configuring Topic Models for Software Engineering Tasks in TraceLab Configuring Topic Models for Software Engineering Tasks in TraceLab Bogdan Dit Annibale Panichella Evan Moritz Rocco Oliveto Massimiliano Di Penta Denys Poshyvanyk Andrea De Lucia TEFSE 13 San Francisco,

More information

An Approach for Mapping Features to Code Based on Static and Dynamic Analysis

An Approach for Mapping Features to Code Based on Static and Dynamic Analysis An Approach for Mapping Features to Code Based on Static and Dynamic Analysis Abhishek Rohatgi 1, Abdelwahab Hamou-Lhadj 2, Juergen Rilling 1 1 Department of Computer Science and Software Engineering 2

More information

TopicViewer: Evaluating Remodularizations Using Semantic Clustering

TopicViewer: Evaluating Remodularizations Using Semantic Clustering TopicViewer: Evaluating Remodularizations Using Semantic Clustering Gustavo Jansen de S. Santos 1, Katyusco de F. Santos 2, Marco Tulio Valente 1, Dalton D. S. Guerrero 3, Nicolas Anquetil 4 1 Federal

More information

Where Should the Bugs Be Fixed?

Where Should the Bugs Be Fixed? Where Should the Bugs Be Fixed? More Accurate Information Retrieval-Based Bug Localization Based on Bug Reports Presented by: Chandani Shrestha For CS 6704 class About the Paper and the Authors Publication

More information

Chapter 6: Information Retrieval and Web Search. An introduction

Chapter 6: Information Retrieval and Web Search. An introduction Chapter 6: Information Retrieval and Web Search An introduction Introduction n Text mining refers to data mining using text documents as data. n Most text mining tasks use Information Retrieval (IR) methods

More information

Can Better Identifier Splitting Techniques Help Feature Location?

Can Better Identifier Splitting Techniques Help Feature Location? Can Better Identifier Splitting Techniques Help Feature Location? Bogdan Dit, Latifa Guerrouj, Denys Poshyvanyk, Giuliano Antoniol SEMERU 19 th IEEE International Conference on Program Comprehension (ICPC

More information

Configuring Topic Models for Software Engineering Tasks in TraceLab

Configuring Topic Models for Software Engineering Tasks in TraceLab Configuring Topic Models for Software Engineering Tasks in TraceLab Bogdan Dit 1, Annibale Panichella 2, Evan Moritz 1, Rocco Oliveto 3, Massimiliano Di Penta 4, Denys Poshyvanyk 1, Andrea De Lucia 2 1

More information

Reuse or Rewrite: Combining Textual, Static, and Dynamic Analyses to Assess the Cost of Keeping a System Up-to-date

Reuse or Rewrite: Combining Textual, Static, and Dynamic Analyses to Assess the Cost of Keeping a System Up-to-date Reuse or Rewrite: Combining Textual, Static, and Dynamic Analyses to Assess the Cost of Keeping a System Up-to-date Giuliano Antoniol Jane Huffman Hayes Yann-Gaël Guéhéneuc Massimiliano di Penta Dépt.

More information

Key Properties for Comparing Modeling Languages and Tools: Usability, Completeness and Scalability

Key Properties for Comparing Modeling Languages and Tools: Usability, Completeness and Scalability Key Properties for Comparing Modeling Languages and Tools: Usability, Completeness and Scalability Timothy C. Lethbridge Department of Electrical Engineering and Computer Science, University of Ottawa

More information

CORRELATING FEATURES AND CODE BY DYNAMIC

CORRELATING FEATURES AND CODE BY DYNAMIC CORRELATING FEATURES AND CODE BY DYNAMIC AND SEMANTIC ANALYSIS Ren Wu Shanghai Lixin University of Commerce, Shanghai 201620, China ABSTRACT One major problem in maintaining a software system is to understand

More information

A Textual-based Technique for Smell Detection

A Textual-based Technique for Smell Detection A Textual-based Technique for Smell Detection Fabio Palomba1, Annibale Panichella2, Andrea De Lucia1, Rocco Oliveto3, Andy Zaidman2 1 University of Salerno, Italy 2 Delft University of Technology, The

More information

Search-Based Software Maintenance and Evolution

Search-Based Software Maintenance and Evolution Search-Based Software Maintenance and Evolution Annibale Panichella Advisor: Prof. Andrea De Lucia Dott. Rocco Oliveto Search-Based Software Engineering «The application of meta-heuristic search-based

More information

An Efficient Approach for Requirement Traceability Integrated With Software Repository

An Efficient Approach for Requirement Traceability Integrated With Software Repository An Efficient Approach for Requirement Traceability Integrated With Software Repository P.M.G.Jegathambal, N.Balaji P.G Student, Tagore Engineering College, Chennai, India 1 Asst. Professor, Tagore Engineering

More information

An Approach for Detecting Execution Phases of a System for the Purpose of Program Comprehension

An Approach for Detecting Execution Phases of a System for the Purpose of Program Comprehension An Approach for Detecting Execution Phases of a System for the Purpose of Program Comprehension Heidar Pirzadeh, Akanksha Agarwal, Abdelwahab Hamou-Lhadj Department of Electrical and Computer Engineering

More information

A Text Retrieval Approach to Recover Links among s and Source Code Classes

A Text Retrieval Approach to Recover Links among  s and Source Code Classes 318 A Text Retrieval Approach to Recover Links among E-Mails and Source Code Classes Giuseppe Scanniello and Licio Mazzeo Universitá della Basilicata, Macchia Romana, Viale Dell Ateneo, 85100, Potenza,

More information

Domain Knowledge Driven Program Analysis

Domain Knowledge Driven Program Analysis Domain Knowledge Driven Program Analysis Daniel Ratiu http://www4.in.tum.de/~ratiu/knowledge_repository.html WSR Bad-Honnef, 4 May 2009 Pressing Issues Program understanding is expensive - over 60% of

More information

AURA: A Hybrid Approach to Identify

AURA: A Hybrid Approach to Identify : A Hybrid to Identify Wei Wu 1, Yann-Gaël Guéhéneuc 1, Giuliano Antoniol 2, and Miryung Kim 3 1 Ptidej Team, DGIGL, École Polytechnique de Montréal, Canada 2 SOCCER Lab, DGIGL, École Polytechnique de

More information

Search Results Clustering in Polish: Evaluation of Carrot

Search Results Clustering in Polish: Evaluation of Carrot Search Results Clustering in Polish: Evaluation of Carrot DAWID WEISS JERZY STEFANOWSKI Institute of Computing Science Poznań University of Technology Introduction search engines tools of everyday use

More information

A Statement Level Bug Localization Technique using Statement Dependency Graph

A Statement Level Bug Localization Technique using Statement Dependency Graph A Statement Level Bug Localization Technique using Statement Dependency Graph Shanto Rahman, Md. Mostafijur Rahman, Ahmad Tahmid and Kazi Sakib Institute of Information Technology, University of Dhaka,

More information

An Efficient Approach for Requirement Traceability Integrated With Software Repository

An Efficient Approach for Requirement Traceability Integrated With Software Repository IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661, p- ISSN: 2278-8727Volume 15, Issue 4 (Nov. - Dec. 2013), PP 65-71 An Efficient Approach for Requirement Traceability Integrated With Software

More information

Just-In-Time Compilation

Just-In-Time Compilation Just-In-Time Compilation Thiemo Bucciarelli Institute for Software Engineering and Programming Languages 18. Januar 2016 T. Bucciarelli 18. Januar 2016 1/25 Agenda Definitions Just-In-Time Compilation

More information

Learning Ontology-Based User Profiles: A Semantic Approach to Personalized Web Search

Learning Ontology-Based User Profiles: A Semantic Approach to Personalized Web Search 1 / 33 Learning Ontology-Based User Profiles: A Semantic Approach to Personalized Web Search Bernd Wittefeld Supervisor Markus Löckelt 20. July 2012 2 / 33 Teaser - Google Web History http://www.google.com/history

More information

FLAT 3 : Feature Location & Textual Tracing Tool

FLAT 3 : Feature Location & Textual Tracing Tool FLAT 3 : Feature Location & Textual Tracing Tool Trevor Savage, Meghan Revelle, Denys Poshyvanyk SEMERU Group @ William and Mary Addressed Problem The software developer has to maintain large software

More information

Automatic Reconstruction of the Underlying Interaction Design of Web Applications

Automatic Reconstruction of the Underlying Interaction Design of Web Applications Automatic Reconstruction of the Underlying Interaction Design of Web Applications L.Paganelli, F.Paternò C.N.R., Pisa Via G.Moruzzi 1 {laila.paganelli, fabio.paterno}@cnuce.cnr.it ABSTRACT In this paper

More information

International Journal of Software Engineering and Knowledge Engineering c World Scientific Publishing Company

International Journal of Software Engineering and Knowledge Engineering c World Scientific Publishing Company International Journal of Software Engineering and Knowledge Engineering c World Scientific Publishing Company Dynamic Knowledge Extraction from Software Systems using Sequential Pattern Mining Kamran Sartipi

More information

Numerical Summaries of Data Section 14.3

Numerical Summaries of Data Section 14.3 MATH 11008: Numerical Summaries of Data Section 14.3 MEAN mean: The mean (or average) of a set of numbers is computed by determining the sum of all the numbers and dividing by the total number of observations.

More information

Using Information Retrieval to Support Software Evolution

Using Information Retrieval to Support Software Evolution Using Information Retrieval to Support Software Evolution Denys Poshyvanyk Ph.D. Candidate SEVERE Group @ Software is Everywhere Software is pervading every aspect of life Software is difficult to make

More information

Putting the Developer in-the-loop: an Interactive GA for Software Re-Modularization

Putting the Developer in-the-loop: an Interactive GA for Software Re-Modularization Putting the Developer in-the-loop: an Interactive GA for Software Re-Modularization Gabriele Bavota 1, Filomena Carnevale 1, Andrea De Lucia 1 Massimiliano Di Penta 2, Rocco Oliveto 3 1 University of Salerno,

More information

SERG. Refactoring Fat Interfaces Using a Genetic Algorithm. Delft University of Technology Software Engineering Research Group Technical Report Series

SERG. Refactoring Fat Interfaces Using a Genetic Algorithm. Delft University of Technology Software Engineering Research Group Technical Report Series Delft University of Technology Software Engineering Research Group Technical Report Series Refactoring Fat Interfaces Using a Genetic Algorithm Daniele Romano, Steven Raemaekers, and Martin Pinzger Report

More information

Improving Bug Management using Correlations in Crash Reports

Improving Bug Management using Correlations in Crash Reports Noname manuscript No. (will be inserted by the editor) Improving Bug Management using Correlations in Crash Reports Shaohua Wang Foutse Khomh Ying Zou Received: date / Accepted: date Abstract Nowadays,

More information

What is this Song About?: Identification of Keywords in Bollywood Lyrics

What is this Song About?: Identification of Keywords in Bollywood Lyrics What is this Song About?: Identification of Keywords in Bollywood Lyrics by Drushti Apoorva G, Kritik Mathur, Priyansh Agrawal, Radhika Mamidi in 19th International Conference on Computational Linguistics

More information

Genetic Algorithms Variations and Implementation Issues

Genetic Algorithms Variations and Implementation Issues Genetic Algorithms Variations and Implementation Issues CS 431 Advanced Topics in AI Classic Genetic Algorithms GAs as proposed by Holland had the following properties: Randomly generated population Binary

More information

Static Pruning of Terms In Inverted Files

Static Pruning of Terms In Inverted Files In Inverted Files Roi Blanco and Álvaro Barreiro IRLab University of A Corunna, Spain 29th European Conference on Information Retrieval, Rome, 2007 Motivation : to reduce inverted files size with lossy

More information

Application of Execution Pattern Mining and Concept Lattice Analysis on Software Structure Evaluation

Application of Execution Pattern Mining and Concept Lattice Analysis on Software Structure Evaluation Application of Execution Pattern Mining and Concept Lattice Analysis on Software Structure Evaluation Kamran Sartipi and Hossein Safyallah Dept. Computing and Software, McMaster University Hamilton, ON,

More information

Chapter 3: Data Description - Part 3. Homework: Exercises 1-21 odd, odd, odd, 107, 109, 118, 119, 120, odd

Chapter 3: Data Description - Part 3. Homework: Exercises 1-21 odd, odd, odd, 107, 109, 118, 119, 120, odd Chapter 3: Data Description - Part 3 Read: Sections 1 through 5 pp 92-149 Work the following text examples: Section 3.2, 3-1 through 3-17 Section 3.3, 3-22 through 3.28, 3-42 through 3.82 Section 3.4,

More information

The Use of Development History in Software Refactoring Using a Multi-Objective Evolutionary Algorithm

The Use of Development History in Software Refactoring Using a Multi-Objective Evolutionary Algorithm The Use of Development History in Software Refactoring Using a Multi-Objective Evolutionary Algorithm Ali Ouni 1,2, Marouane Kessentini 2, Houari Sahraoui 1, and Mohamed Salah Hamdi 3 1 DIRO, Université

More information

MODELLING DOCUMENT CATEGORIES BY EVOLUTIONARY LEARNING OF TEXT CENTROIDS

MODELLING DOCUMENT CATEGORIES BY EVOLUTIONARY LEARNING OF TEXT CENTROIDS MODELLING DOCUMENT CATEGORIES BY EVOLUTIONARY LEARNING OF TEXT CENTROIDS J.I. Serrano M.D. Del Castillo Instituto de Automática Industrial CSIC. Ctra. Campo Real km.0 200. La Poveda. Arganda del Rey. 28500

More information

5. Computational Geometry, Benchmarks and Algorithms for Rectangular and Irregular Packing. 6. Meta-heuristic Algorithms and Rectangular Packing

5. Computational Geometry, Benchmarks and Algorithms for Rectangular and Irregular Packing. 6. Meta-heuristic Algorithms and Rectangular Packing 1. Introduction 2. Cutting and Packing Problems 3. Optimisation Techniques 4. Automated Packing Techniques 5. Computational Geometry, Benchmarks and Algorithms for Rectangular and Irregular Packing 6.

More information

Using Semantic Similarity in Crawling-based Web Application Testing. (National Taiwan Univ.)

Using Semantic Similarity in Crawling-based Web Application Testing. (National Taiwan Univ.) Using Semantic Similarity in Crawling-based Web Application Testing Jun-Wei Lin Farn Wang Paul Chu (UC-Irvine) (National Taiwan Univ.) (QNAP, Inc) Crawling-based Web App Testing the web app under test

More information

Part I: Preliminaries 24

Part I: Preliminaries 24 Contents Preface......................................... 15 Acknowledgements................................... 22 Part I: Preliminaries 24 1. Basics of Software Testing 25 1.1. Humans, errors, and testing.............................

More information

Multimedia Information Extraction and Retrieval Term Frequency Inverse Document Frequency

Multimedia Information Extraction and Retrieval Term Frequency Inverse Document Frequency Multimedia Information Extraction and Retrieval Term Frequency Inverse Document Frequency Ralf Moeller Hamburg Univ. of Technology Acknowledgement Slides taken from presentation material for the following

More information

Mining Web Data. Lijun Zhang

Mining Web Data. Lijun Zhang Mining Web Data Lijun Zhang zlj@nju.edu.cn http://cs.nju.edu.cn/zlj Outline Introduction Web Crawling and Resource Discovery Search Engine Indexing and Query Processing Ranking Algorithms Recommender Systems

More information

CODES User Guide. October 2013 Version 1.0. Gerardo Canfora, University of Sannio (Italy) Massimiliano Di Penta, University of Sannio (Italy)

CODES User Guide. October 2013 Version 1.0. Gerardo Canfora, University of Sannio (Italy) Massimiliano Di Penta, University of Sannio (Italy) CODES User Guide October 2013 Version 1.0 Gerardo Canfora, University of Sannio (Italy) Massimiliano Di Penta, University of Sannio (Italy) Sebastiano Panichella, University of Sannio (Italy) Carmine Vassallo,

More information

Empirical Studies of Test Case Prioritization in a JUnit Testing Environment

Empirical Studies of Test Case Prioritization in a JUnit Testing Environment University of Nebraska - Lincoln DigitalCommons@University of Nebraska - Lincoln CSE Conference and Workshop Papers Computer Science and Engineering, Department of 2004 Empirical Studies of Test Case Prioritization

More information

A Improving Software Modularization via Automated Analysis of Latent Topics and Dependencies

A Improving Software Modularization via Automated Analysis of Latent Topics and Dependencies A Improving Software Modularization via Automated Analysis of Latent Topics and Dependencies GABRIELE BAVOTA, University of Salerno, Italy MALCOM GETHERS, University of Maryland, Baltimore County, USA

More information

Re-engineering Software Variants into Software Product Line

Re-engineering Software Variants into Software Product Line Re-engineering Software Variants into Software Product Line Présentation extraite de la soutenance de thèse de M. Ra'Fat AL-Msie'Deen University of Montpellier Software product variants 1. Software product

More information

Digital Libraries: Language Technologies

Digital Libraries: Language Technologies Digital Libraries: Language Technologies RAFFAELLA BERNARDI UNIVERSITÀ DEGLI STUDI DI TRENTO P.ZZA VENEZIA, ROOM: 2.05, E-MAIL: BERNARDI@DISI.UNITN.IT Contents 1 Recall: Inverted Index..........................................

More information

Extractive Text Summarization Techniques

Extractive Text Summarization Techniques Extractive Text Summarization Techniques Tobias Elßner Hauptseminar NLP Tools 06.02.2018 Tobias Elßner Extractive Text Summarization Overview Rough classification (Gupta and Lehal (2010)): Supervised vs.

More information

Recovering Traceability Links between Code and Documentation

Recovering Traceability Links between Code and Documentation Recovering Traceability Links between Code and Documentation Paper by: Giuliano Antoniol, Gerardo Canfora, Gerardo Casazza, Andrea De Lucia, and Ettore Merlo Presentation by: Brice Dobry and Geoff Gerfin

More information

Towards Cohesion-based Metrics as Early Quality Indicators of Faulty Classes and Components

Towards Cohesion-based Metrics as Early Quality Indicators of Faulty Classes and Components 2009 International Symposium on Computing, Communication, and Control (ISCCC 2009) Proc.of CSIT vol.1 (2011) (2011) IACSIT Press, Singapore Towards Cohesion-based Metrics as Early Quality Indicators of

More information

MAINTENANCE of legacy software involves costly and

MAINTENANCE of legacy software involves costly and IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, VOL. 32, NO. 9, SEPTEMBER 2006 627 Feature Identification: An Epidemiological Metaphor Giuliano Antoniol and Yann-Gaël Guéhéneuc Abstract Feature identification

More information

Mining Features from the Object-Oriented Source Code of a Collection of Software Variants Using Formal Concept Analysis and Latent Semantic Indexing

Mining Features from the Object-Oriented Source Code of a Collection of Software Variants Using Formal Concept Analysis and Latent Semantic Indexing Mining Features from the Object-Oriented Source Code of a Collection of Software Variants Using Formal Concept Analysis and Latent Semantic Indexing R. AL-msie deen 1, A.-D. Seriai 1, M. Huchard 1, C.

More information

Feature Location via Information Retrieval based Filtering of a Single Scenario Execution Trace

Feature Location via Information Retrieval based Filtering of a Single Scenario Execution Trace Feature Location via Information Retrieval based Filtering of a Single Scenario Execution Trace Dapeng Liu, Andrian Marcus, Denys Poshyvanyk, Václav Rajlich SEVERE Group @ Incremental Change of Software

More information

Mining Web Data. Lijun Zhang

Mining Web Data. Lijun Zhang Mining Web Data Lijun Zhang zlj@nju.edu.cn http://cs.nju.edu.cn/zlj Outline Introduction Web Crawling and Resource Discovery Search Engine Indexing and Query Processing Ranking Algorithms Recommender Systems

More information

Box and Whisker Plot Review A Five Number Summary. October 16, Box and Whisker Lesson.notebook. Oct 14 5:21 PM. Oct 14 5:21 PM.

Box and Whisker Plot Review A Five Number Summary. October 16, Box and Whisker Lesson.notebook. Oct 14 5:21 PM. Oct 14 5:21 PM. Oct 14 5:21 PM Oct 14 5:21 PM Box and Whisker Plot Review A Five Number Summary Activities Practice Labeling Title Page 1 Click on each word to view its definition. Outlier Median Lower Extreme Upper Extreme

More information

Tag-based Social Interest Discovery

Tag-based Social Interest Discovery Tag-based Social Interest Discovery Xin Li / Lei Guo / Yihong (Eric) Zhao Yahoo!Inc 2008 Presented by: Tuan Anh Le (aletuan@vub.ac.be) 1 Outline Introduction Data set collection & Pre-processing Architecture

More information

Deterministic Parallel Programming

Deterministic Parallel Programming Deterministic Parallel Programming Concepts and Practices 04/2011 1 How hard is parallel programming What s the result of this program? What is data race? Should data races be allowed? Initially x = 0

More information

UNIT-IV BASIC BEHAVIORAL MODELING-I

UNIT-IV BASIC BEHAVIORAL MODELING-I UNIT-IV BASIC BEHAVIORAL MODELING-I CONTENTS 1. Interactions Terms and Concepts Modeling Techniques 2. Interaction Diagrams Terms and Concepts Modeling Techniques Interactions: Terms and Concepts: An interaction
