Denys Poshyvanyk, William and Mary. Mark Grechanik, Accenture Technology Labs & University of Illinois, Chicago

Similar documents
Exemplar: A Search Engine For Finding Highly Relevant Applications

Dr. Sushil Garg Professor, Dept. of Computer Science & Applications, College City, India

Enhancing Software Traceability By Automatically Expanding Corpora With Relevant Documentation

Searching, Selecting, and Synthesizing Source Code Components. Collin McMillan. Prairie Village, Kansas

PROCE55 Mobile: Web API App. Web API.

Calcite: Completing Code Completion for Constructors using Crowds

Case Study: Dodging the Pitfalls of Enterprise Ajax Applications

1. Implementation of Inheritance with objects, methods. 2. Implementing Interface in a simple java class. 3. To create java class with polymorphism

Recommending Source Code for Rapid Software Prototypes

Android Application Development using Kotlin

Developing Solutions for Google Cloud Platform (CPD200) Course Agenda

FACULTY OF ENGINEERING B.E. 4/4 (CSE) II Semester (Old) Examination, June Subject : Information Retrieval Systems (Elective III) Estelar

CLAN: Closely related ApplicatioNs

CoDocent: Support API Usage with Code Example and API Documentation

SERVICE-ORIENTED COMPUTING

Copyright

ArcGIS Runtime SDK for Android: Building Apps. Shelly Gill

Another difference is that the kernel includes only the suspend to memory mechanism, and not the suspend to hard disk, which is used on PCs.

Reusing Reused Code II. CODE SUGGESTION ARCHITECTURE. A. Overview

What s New In DFC. Quick Review. Agenda. Quick Review Release 5.3 Q&A Post 5.3 plans. David Folk Product Manager

PROVIDING YOU LOG INFRASTRUCTURE LOG COLLECTION SOLUTIONS TO BUILD A SECURE, FLEXIBLE AND RELIABLE

Sentinel RMS SDK v9.3.0

OWASP Top David Caissy OWASP Los Angeles Chapter July 2017

Lecture 2 Android SDK

Using the Cisco ACE Application Control Engine Application Switches with the Cisco ACE XML Gateway

Java Training Center - Android Application Development

Full-Text Indexing For Heritrix

Minds-on: Android. Session 1

6.1 Understand Relational Database Management Systems

CMSC436: Fall 2013 Week 4 Lab

ATC Android Application Development

SilverCreek The World s Best-Selling SNMP Test Suite

Oracle Database 12c Performance Management and Tuning

PCI DSS and the VNC SDK

20483BC: Programming in C#

GET YOUR APP READY FOR QUICKBOOKS CONNECT

Performance Optimization for Informatica Data Services ( Hotfix 3)

ON AUTOMATICALLY DETECTING SIMILAR ANDROID APPS. By Michelle Dowling

SilverCreek SNMP Test Suite

Testing JDBC Applications Using DataDirect Test for JDBC

ForgeRock Access Management Customization and APIs

A Case Study on the Similarity Between Source Code and Bug Reports Vocabularies

Introduction to Software Engineering: Analysis

Implementing Citrix XenApp 5.0 for Windows Server 2008

Reusability and Adaptability of Interactive Resources in Web-Based Educational Systems. 01/06/2003

Course 40045A: Microsoft SQL Server for Oracle DBAs

RACK: Automatic API Recommendation using Crowdsourced Knowledge

Introduction. Lecture 1. Operating Systems Practical. 5 October 2016

SharePoint 2013 Developer

What Is Voice SEO and Why Should My Site Be Optimized For Voice Search?

PROGRAMMING WITH THE MICROSOFT.NET FRAMEWORK USING MICROSOFT VISUAL STUDIO 2005 Course No. MS4995A 5 Day PREREQUISITES COURSE OUTLINE

Adding a Source Code Searching Capability to Yioop ADDING A SOURCE CODE SEARCHING CAPABILITY TO YIOOP CS297 REPORT

UNIT-V WEB MINING. 3/18/2012 Prof. Asha Ambhaikar, RCET Bhilai.

ANDROID DEVELOPMENT. Course Details

Android Studio is google's official IDE(Integrated Development Environment) for Android Developers.

A Comparison of Three Document Clustering Algorithms: TreeCluster, Word Intersection GQF, and Word Intersection Hierarchical Agglomerative Clustering

Assignment 3 ITCS-6010/8010: Cloud Computing for Data Analysis

Automating Document Imports with Nolij Auto Import and Import

Sun Mgt Bonus Lab 6: Migration to App-ID Security Policy

1. GOALS and MOTIVATION

Comment-based Keyword Programming

1. Data Model, Categories, Schemas and Instances. Outline

A Data Modeling Process. Determining System Requirements. Planning the Project. Specifying Relationships. Specifying Entities

Java EE 6: Develop Business Components with JMS & EJBs

Ranked Retrieval. Evaluation in IR. One option is to average the precision scores at discrete. points on the ROC curve But which points?

Assieme: Finding and Leveraging Implicit References in a Web Search Interface for Programmers

Android Online Training

Development of Contents Management System Based on Light-Weight Ontology

Distributed Indexing of the Web Using Migrating Crawlers

Installation Instructions

Closure Compiler: Speeding Web Appications by Compiling JavaScript

Google Cloud Platform for Systems Operations Professionals (CPO200) Course Agenda

SOURCERER: MINING AND SEARCHING INTERNET- SCALE SOFTWARE REPOSITORIES

TYPE ADOPTION IN xcp APPLICATIONS

Administrator s Guide

9 th CA 2E/CA Plex Worldwide Developer Conference 1

Mission Guide: Google Apps

Excel4apps Wands 5 Architecture Excel4apps Inc.

BMC Remedyforce Discovery and Client Management. Frequently asked questions

ANNUAL REPORT Visit us at project.eu Supported by. Mission

MUDABlue: An Automatic Categorization System for Open Source Repositories

Android Basics Nanodegree Syllabus

Programming Web Services in Java

Android. Lesson 1. Introduction. Android Developer Fundamentals. Android Developer Fundamentals. to Android 1

FLAT 3 : Feature Location & Textual Tracing Tool

Service Mesh and Microservices Networking

Query Result Extraction Using Dynamic Query Forms

Required Core Java for Android application development

ClearPass Policy Manager. Configuration API Guide

Review of Lone Star Software Symposium: NFJS Peter Donton

The C# Programming Language. Overview

Android App Development

Chapter 2. Architecture of a Search Engine

Information Retrieval II

Introduction to Spring 5, Spring MVC and Spring REST

The Corticon Rule Modeling Methodology. Applied to. FEMA Disaster Assistance Fraud Detection. A Case Study

Single Sign-On Best Practices

First Research. User s Guide

Introducing the Harmony Core Open Source Project Presented by Jeff Greene

BSCI Word Merge ipart User Notes - imis xx

Transcription:

Denys Poshyvanyk, William and Mary Mark Grechanik, Accenture Technology Labs & University of Illinois, Chicago

How Many Open Source Applications Are There? Sourceforge.net reports that they host 180,000 projects as of August 1, 2008. There are dozens of other open source repositories containing tens of thousands of different applications. Companies have internal source control management systems containing hundreds of thousands of applications.

Problem Finding/checking existing software matching high-level user requirements Would reduce the cost of many software projects Would provide users with examples of different implementations Challenges: Finding relevant applications is difficult Evaluating retrieved applications is difficult

What Search Engines Do encrypt compress XML data Matching keywords descriptions of apps

What Search Engines Do encrypt compress XML data Matching keywords descriptions of apps

Fundamental Problems Vocabulary problem Mismatch between the high-level intent reflected in the descriptions of applications and their low-level implementation details Concept assignment problem High level concept Send data Code snippet implementing Send data s = socket.socket(proto, socket.sock_dgram) s.sendto(teststring, addr) buf = data = receive(s, 100) while data and '\n' not in buf: data = receive(s, 100) buf += data Many application repositories are polluted with poorly functioning projects

Working without a Tool Find relevant application(s) Download application Locate and examine fragments of the code that implement the desired features Observe the runtime behavior of this application to ensure that this behavior matches requirements This process is manual since programmers: study the source code of the retrieved applications locate various API calls read information about these calls in help documents Still, it is difficult for programmers to link high -level concepts from requirements to their implementations in source code

Our Goal encrypt compress XML data search

encrypt compress XML data Our Goal

Key observations While studying retrieved apps developers : locate various API calls read information about these calls in help documents Help docs are supplied by the same vendors whose packages/apis are used in software Programmers read and rely on these API docs Help docs are written and reviewed by many developers Help documents are usually more verbose and accurate than project descriptions

How S 3 System Works API call 1 User query match descriptions of API calls API call 2 API call 3 Automatically matching words in user queries against API help docs instead of: searching in project descriptions; searching in source code. S 3 uses help documents to produce a list of relevant API calls

S 3 Architecture

Current Status Restricting the scope to Java projects Challenges: How to automatically locate and download the latest version of the software (e.g., from sourceforge)? How to automatically locate the correct entry point (i.e., main) for static analysis? How to reduce the time for the static analysis? How and when to update API call dictionary? Testing other ranking heuristics Evaluation is pending (some preliminary results at the poster session)

Related Work CodeFinder/Helgon ParseWeb CodeBroker Hipikat Automated Method Completion (AMC) Strathcona Prospector XSnippet Google code search, Krugle,

Conclusions & Future Work S 3 recommends/checks relevant applications based on: analysis of relevant API help documents; analysis of actual API calls. Indexing available open-source projects and pre-computing data and control flow among API calls Analyzing multiple releases of the same project