ON AUTOMATICALLY DETECTING SIMILAR ANDROID APPS. By Michelle Dowling

Similar documents
DELDroid: Determination & Enforcement of Least Privilege Architecture in AnDroid

MUDABlue: An Automatic Categorization System for Open Source Repositories

Using Semantic Similarity in Crawling-based Web Application Testing. (National Taiwan Univ.)

6.858 Quiz 2 Review. Android Security. Haogang Chen Nov 24, 2014

Parameterizing and Assembling IR-based Solutions for SE Tasks using Genetic Algorithms

Detecting Third-Party Libraries in Android Applications with High Precision and Recall

Deprecating the Password: A Progress Report. Dr. Michael B. Jones Identity Standards Architect, Microsoft May 17, 2018

AceDroid: Normalizing Diverse Android

Mining Web Data. Lijun Zhang

Finding Unknown Malice in 10 Seconds: Mass Vetting for New Threats at the Google-Play Scale

From time to time Google changes the way it does things, and old tutorials may not apply to some new procedures.

Syllabus- Java + Android. Java Fundamentals

Help! My Birthday Reminder Wants to Brick My Phone!

Android System Architecture. Android Application Fundamentals. Applications in Android. Apps in the Android OS. Program Model 8/31/2015

Published in A R DIGITECH

Security Philosophy. Humans have difficulty understanding risk

CS 112 Introduction to Computing II. Wayne Snyder Computer Science Department Boston University

1. GOALS and MOTIVATION

highest cosine coefficient [5] are returned. Notice that a query can hit documents without having common terms because the k indexing dimensions indicate

A Framework for Evaluating Mobile App Repackaging Detection Algorithms

Semantic text features from small world graphs

Zymkey App Utils. Generated by Doxygen

Clustering. Bruno Martins. 1 st Semester 2012/2013

Indoor WiFi Localization with a Dense Fingerprint Model

Static analysis for quality mobile applications

WuKong: A Scalable and Accurate Two-Phase Approach to Android App Clone Detection

Parley: Federated Virtual Machines

Handling Missing Values via Decomposition of the Conditioned Set

Android Developer Fundamentals. Lesson 1: Introduction to Android

An Attempt to Identify Weakest and Strongest Queries

Classifying Images with Visual/Textual Cues. By Steven Kappes and Yan Cao

Clustering and Dimensionality Reduction. Stony Brook University CSE545, Fall 2017

A Spectral-based Clustering Algorithm for Categorical Data Using Data Summaries (SCCADDS)

Entity and Knowledge Base-oriented Information Retrieval

CS473: Course Review CS-473. Luo Si Department of Computer Science Purdue University

Detecting Self-Mutating Malware Using Control-Flow Graph Matching

Leveraging the Social Web for Situational Application Development and Business Mashups

CS224W Project: Recommendation System Models in Product Rating Predictions

Contextion: A Framework for Developing Context-Aware Mobile Applications

The State of the Trust Gap in 2015

User Guide for itrust over SMS and itrust over Wi-Fi Direct Installation and Use

Detecting Similar Repositories on GitHub

STEALING PINS VIA MOBILE SENSORS: ACTUAL RISK VERSUS USER PERCEPTION

Where Should the Bugs Be Fixed?

ASCERTAINING THE RELEVANCE MODEL OF A WEB SEARCH-ENGINE BIPIN SURESH

Topic 9: Type Checking

Topic 9: Type Checking

An Exploratory Study on Interface Similarities in Code Clones

Language-Based Security on Android (call for participation) Avik Chaudhuri

Introduction to Information Retrieval

New Programming Paradigms

Searching for Configurations in Clone Evaluation A Replication Study

IGEEKS TECHNOLOGIES. Software Training Division. Academic Live Projects For BE,ME,MCA,BCA and PHD Students

Understanding the Query: THCIB and THUIS at NTCIR-10 Intent Task. Junjun Wang 2013/4/22

Free for All! Assessing User Data Exposure to Advertising Libraries on Android

Android Camera. Alexander Nelson October 6, University of Arkansas - Department of Computer Science and Computer Engineering

PAPER ON ANDROID ESWAR COLLEGE OF ENGINEERING SUBMITTED BY:

Open mustard seed. Patrick Deegan, Ph.D. ID3

A Comparison of Document Clustering Techniques

Playing with skype. 4knahs

End-to-End Neural Ad-hoc Ranking with Kernel Pooling

On Veracious Search In Unsystematic Networks

Denys Poshyvanyk, William and Mary. Mark Grechanik, Accenture Technology Labs & University of Illinois, Chicago

CSE 494: Information Retrieval, Mining and Integration on the Internet

CORMORANT Continuous risk-aware multi-modal authentication

OWASP German Chapter Stammtisch Initiative/Ruhrpott. Android App Pentest Workshop 101

A Comparison of the Booch Method and Shlaer-Mellor OOA/RD

Software Development Kit for ios and Android

Camera and Intent. Elena Fortini

Hashing with Graphs. Sanjiv Kumar (Google), and Shih Fu Chang (Columbia) June, 2011

Re-engineering Software Variants into Software Product Line

Feature selection. LING 572 Fei Xia

Information Retrieval. hussein suleman uct cs

Design Pattern: Composite

Topics Covered Thus Far. CMSC 330: Organization of Programming Languages. Language Features Covered Thus Far. Programming Languages Revisited

Extraction of Semantic Text Portion Related to Anchor Link

An Exploratory Study on the Relation between User Interface Complexity and the Perceived Quality of Android Applications

HUKB at NTCIR-12 IMine-2 task: Utilization of Query Analysis Results and Wikipedia Data for Subtopic Mining

Scaling Techniques in Political Science

Lecture 08. Android Permissions Demystified. Adrienne Porter Felt, Erika Chin, Steve Hanna, Dawn Song, David Wagner. Operating Systems Practical

Intents and Intent Filters

Text Analytics (Text Mining)

Presto 2.0. Using Presto Rules

Ad Hoc Airborne Networking for Telemetry Test and Evaluation

An Oracle White Paper October Oracle Social Cloud Platform Text Analytics

Lecture on Modeling Tools for Clustering & Regression

Building a Recommendation System for EverQuest Landmark s Marketplace

MARS AREA SCHOOL DISTRICT Curriculum TECHNOLOGY EDUCATION

Feature LDA: a Supervised Topic Model for Automatic Detection of Web API Documentations from the Web

CMSC 330: Organization of Programming Languages

Creating a Lattix Dependency Model The Process

Questions and Answers. Q.1) Which of the following is the most "resource hungry" part of dealing with activities on Android?

Information Retrieval. CS630 Representing and Accessing Digital Information. What is a Retrieval Model? Basic IR Processes

Exemplar: A Search Engine For Finding Highly Relevant Applications

Leveraging Set Relations in Exact Set Similarity Join

Estimating Mobile Application Energy Consumption Using Program Analysis

A Measurement of Similarity to Identify Identical Code Clones

LABORATORY 1 REVISION

Direct Matrix Factorization and Alignment Refinement: Application to Defect Detection

Text Analytics (Text Mining)

Transcription:

ON AUTOMATICALLY DETECTING SIMILAR ANDROID APPS By Michelle Dowling

Motivation

Searching for similar mobile apps is becoming increasingly important:
- Looking for substitute apps
- Opportunistic code reuse
- Prototyping
- Market analysis
- Finding an application similar to one in development
- Identifying salient features in successful apps
- Discovering plagiarism or clones across markets
- Learning how to use APIs

The Problem

Due to the sheer number of mobile apps available, determining which apps are similar is becoming increasingly hard. This is worsened by the complexity of detecting similar apps:
- High-level requirements must be matched semantically
- Some app repositories contain poorly functioning projects
- Matching words in requirement documents against words in descriptions or source code does not mean an app is relevant to the requirements
- A high number of matches in low-level functionality does not mean the high-level functionalities match
- Market-specific search engines rely only on text descriptions

The Solution: CLANdroid

- Detects similar Android apps in free app markets or open source repositories
- Uses a set of semantic anchors to detect similarities between apps:
  - Previously used semantic anchors (e.g., identifiers and API calls)
  - Explicit and implicit intents used by the apps
  - User permissions declared in the manifest files
  - Sensors used by the apps
- Capable of detecting similar apps even when source code is not available
- Available as an online tool for detecting similar Android apps, in which users can specify the options and semantic anchors used

The Concept

CLANdroid builds on ideas from CLAN:
- Similar apps share some semantic anchors (e.g., API calls)
- Requirements are implemented using combinations of different semantic anchors
- Some semantic anchors are more descriptive of salient features

CLANdroid adds intents, permissions, and sensors to the API-call semantic anchors.

The Math

A Term-Document Matrix (TDM) is built at the app level to capture the semantic anchors and their frequencies. Applying Latent Semantic Indexing (LSI), which relies on Singular Value Decomposition (SVD), to each TDM extracts latent concepts for each semantic anchor. A cosine similarity measure computed on the resulting vectors then determines how similar two apps are.
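The TDM-plus-LSI pipeline can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the anchor names and frequency counts below are invented, and a tiny k is used for the truncated SVD.

```python
import numpy as np

# Toy term-document matrix: rows are semantic anchors (hypothetical API
# calls), columns are apps, entries are anchor frequencies per app.
tdm = np.array([
    [4.0, 3.0, 0.0],   # Camera.open
    [1.0, 0.0, 5.0],   # SensorManager.getDefaultSensor
    [2.0, 2.0, 1.0],   # Intent.setAction
])

# LSI: a truncated SVD keeps only the k strongest latent concepts.
U, s, Vt = np.linalg.svd(tdm, full_matrices=False)
k = 2
app_vectors = (np.diag(s[:k]) @ Vt[:k]).T   # one k-dimensional vector per app

def cosine(a, b):
    """Cosine similarity between two concept-space vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

sim_camera_apps = cosine(app_vectors[0], app_vectors[1])  # two camera-heavy apps
sim_mixed_apps = cosine(app_vectors[0], app_vectors[2])   # camera app vs. sensor app
```

Because a query app and a candidate app are compared in the reduced concept space, two apps can score as similar even without sharing every raw anchor, which is the point of the LSI step.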

The Workflow

1. Apps are downloaded from Google Play in APK format or from open source repositories.
2. Semantic anchors are extracted from the apps statically by converting the APK files into JAR files and extracting information from the source files.
3. TDMs are created for each type of semantic anchor for each app (API calls, sensors, intents, permissions, identifiers).
4. The LSI algorithm is applied to each TDM; using its results, the cosine distance between apps is computed.
5. Users select a target app and a semantic anchor in CLANdroid to perform searches and retrieve results based on the computed similarity index.
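Step 2 hinges on pulling anchors out of app artifacts. As one hedged illustration, declared permissions and intent actions can be read from a decoded AndroidManifest.xml with only the standard library; the manifest snippet below is invented, and real APKs store the manifest in a binary XML format that must first be decoded (e.g., with a tool such as apktool).

```python
import xml.etree.ElementTree as ET

# Namespace used for android:* attributes in manifest files.
ANDROID_NS = "{http://schemas.android.com/apk/res/android}"

# A hypothetical, already-decoded manifest fragment.
manifest_xml = """<manifest xmlns:android="http://schemas.android.com/apk/res/android">
  <uses-permission android:name="android.permission.CAMERA"/>
  <uses-permission android:name="android.permission.INTERNET"/>
  <application>
    <activity android:name=".MainActivity">
      <intent-filter>
        <action android:name="android.intent.action.SEND"/>
      </intent-filter>
    </activity>
  </application>
</manifest>"""

root = ET.fromstring(manifest_xml)
# Permission anchors: every <uses-permission android:name="..."> entry.
permissions = [e.get(ANDROID_NS + "name") for e in root.iter("uses-permission")]
# Intent anchors: every <action android:name="..."> inside an intent filter.
intent_actions = [e.get(ANDROID_NS + "name") for e in root.iter("action")]
```

The extracted lists would then feed the per-anchor TDMs built in step 3.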

Empirical Study on CLANdroid

Goals:
- Evaluate which semantic anchors are most useful in detecting similar apps
- Analyze the impact of 3rd-party libraries and obfuscated code when trying to detect similar apps

Empirical Study on CLANdroid

Research Questions:
1. What semantic anchors used in CLANdroid produce better results when compared to the others? (Evaluates whether Android-specific semantic anchors outperform API calls.)
2. How orthogonal are the apps detected by CLANdroid as compared to Google Play?
3. Do 3rd-party libraries and obfuscated apps impact the accuracy of CLANdroid?

Empirical Study on CLANdroid

Answering RQ1 (What semantic anchors used in CLANdroid produce better results when compared to the others?):
- 12 apps were randomly selected from the pool of 14,450 apps downloaded from Google Play
- The top 3 similar apps from the same pool that belonged to the same category were determined using 1 of 6 instances of CLANdroid: (1) API calls, (2) intents, (3) permissions, (4) sensors, (5) identifiers, (6) API calls + identifiers
- 6 versions of a 12-question survey (a cross-validation design) were generated, asking 27 participants to rate the similarity of the 3 detected apps to the original app on a 4-point Likert scale
- To determine whether there were significant differences in the Likert similarity ratings across the CLANdroid instances, the authors used the Kruskal-Wallis test with a post-hoc procedure for pairwise comparisons

Empirical Study on CLANdroid

Answering RQ2 (How orthogonal are the apps detected by CLANdroid as compared to Google Play?):
- For each of the 14,450 apps, each of the 6 previously mentioned instances of CLANdroid was run to determine a set of similar apps (in any category)
- The rankings of similar apps from each CLANdroid instance were compared to the rankings of similar apps listed by Google Play:
  - TOP R = the top rank of any application from Google Play's list that appears in the CLANdroid list
  - AVG R = the average rank of the applications from Google Play's list that appear in the CLANdroid list
- To evaluate the results, the authors compared the TOP R and AVG R series of the CLANdroid instances with the Kruskal-Wallis test with a post-hoc procedure
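Under the definitions above, the two metrics reduce to a few lines. The app names here are hypothetical, and this is only a sketch of the metric definitions, not the paper's evaluation code.

```python
# Hypothetical ranked lists of similar apps for one query app: one from
# Google Play, one produced by a CLANdroid instance.
play_list = ["app.maps", "app.nav", "app.compass"]
clandroid_list = ["app.weather", "app.nav", "app.maps", "app.notes"]

def top_r(reference, ranking):
    """Best (1-based) rank in `ranking` of any app from `reference`."""
    ranks = [ranking.index(a) + 1 for a in reference if a in ranking]
    return min(ranks) if ranks else None

def avg_r(reference, ranking):
    """Average (1-based) rank in `ranking` of reference apps that appear in it."""
    ranks = [ranking.index(a) + 1 for a in reference if a in ranking]
    return sum(ranks) / len(ranks) if ranks else None

top = top_r(play_list, clandroid_list)   # app.nav appears at rank 2
avg = avg_r(play_list, clandroid_list)   # ranks 2 and 3 average to 2.5
```

Lower values mean CLANdroid's list agrees more closely with Google Play's; the `None` case corresponds to the limitation that some apps have no Google Play match inside the downloaded pool.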

Empirical Study on CLANdroid

Answering RQ3 (Do 3rd-party libraries and obfuscated apps impact the accuracy of CLANdroid?):
- The process from RQ2 was repeated, except that obfuscated code and 3rd-party libraries were excluded
- To measure the impact of 3rd-party libraries and obfuscated code on CLANdroid's results, the authors used only pairwise Mann-Whitney comparisons between the values of (i) TOP R and AVG R with and without third-party libraries, and (ii) TOP R and AVG R with and without obfuscated code
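The Mann-Whitney comparison used here can be sketched as follows. This hand-rolled U statistic (with average ranks for ties) is only illustrative, and the input values are made up rather than taken from the paper; in practice a library routine would also report the p-value.

```python
def mann_whitney_u(x, y):
    """Mann-Whitney U statistic for sample x against y, averaging tied ranks."""
    combined = sorted(x + y)
    ranks = {}
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j] == combined[i]:
            j += 1
        ranks[combined[i]] = (i + 1 + j) / 2  # mean of the 1-based ranks i+1..j
        i = j
    rank_sum_x = sum(ranks[v] for v in x)
    return rank_sum_x - len(x) * (len(x) + 1) / 2

# Hypothetical AVG R values for a set of query apps, with and without
# third-party library code included in the anchors.
with_libs = [5.0, 7.0, 6.0, 8.0]
without_libs = [3.0, 4.0, 2.0, 6.0]
u = mann_whitney_u(with_libs, without_libs)
```

The two one-sided U values always sum to `len(x) * len(y)`, which gives a quick sanity check on the implementation.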

Empirical Study on CLANdroid

Limitations:
- There is no guaranteed symmetry in the lists of similar apps provided by Google Play (i.e., if app A is shown as similar to app B, app B may not be shown as similar to app A)
- The list of apps produced by Google Play was reduced to contain only apps within the set of 14,450 downloaded apps
- Because of these two points, some apps in Google Play have no similar apps within the set of downloaded apps; these apps were still included in the study, since removing them would leave other apps with no similar apps to use
- The study makes the strong assumption that Google Play's lists of similar apps are always right
- Similarity rankings are determined across categories, meaning searches for apps within a category are still affected by apps outside that category
- For 649 applications, third-party libraries could not be distinguished from the rest of the code, so no information was extracted from the source code of those libraries
- The apps downloaded from Google Play are not necessarily representative of all apps in Google Play, so the conclusions cannot be generalized to all of Google Play
- The opinions of the participants on the similarity of apps are not representative of all user opinions

Results

RQ1 (What semantic anchors used in CLANdroid produce better results when compared to the others?):
- Using API calls to determine similar apps produced the best results
- Identifiers were a close second, followed by permissions, then intents
- Sensors performed the worst, and this difference was statistically significant

Results

RQ2 (How orthogonal are the apps detected by CLANdroid as compared to Google Play?):
- On average, using API calls and intents resulted in the worst TOP R ratings; sensors were by far the best, followed by identifiers and permissions
- Similarly, sensors had the best average AVG R rating, while API calls and intents were the worst
- These results are statistically significant in all cases except when comparing identifiers vs. permissions and intents vs. API calls combined with intents
- On average, TOP R using any semantic anchor was high:
  - Only 471 apps had an app from Google Play's list in the top position
  - This number increases to 1,134 when apps of other categories are removed from CLANdroid's lists (663 apps had an app from a different category ranked higher)
- Evaluation of the top-rated similar apps in Google Play vs. CLANdroid suggests that the apps in Google Play's lists are not necessarily functionally similar; perhaps Google Play relies more on text descriptions and sensors to determine app similarity

Results

RQ3 (Do 3rd-party libraries and obfuscated apps impact the accuracy of CLANdroid?):
- The results using permissions do not change, as the manifest file is unaffected by 3rd-party libraries or obfuscated code
- In terms of AVG R, identifiers are the best approach and API calls are the worst; the TOP R results are similar
- TOP R and AVG R improve significantly, and by a large amount, when 3rd-party libraries are excluded
- AVG R improves significantly, but by a negligible amount, when apps with obfuscated code are excluded
- TOP R performed worse when apps with obfuscated code were excluded

Related Work

- AnDarwin extracts semantic vectors from source code methods and compares them, looking for code clones between apps to determine similarity
- Dstruct decompiles APKs and walks through directories and files to determine directory structure; trees representing the directory structure are used to determine app similarity
- Chen et al. detect similar apps by comparing centroids created from dependency graphs at the method level, but only flag pairs of apps as clones or not clones; they evaluated their approach across Android markets, but did not use Google Play
- Gorla et al. use LDA to determine categories of apps, then cluster apps based on these categories using k-means
- Desnos used method signatures to detect similar Android apps, where the signatures were composed of string literals, API calls, control flow structures, and exceptions
- Wang et al. first remove the code of 3rd-party libraries from APKs, then use fingerprints containing API calls to detect repackaged or cloned apps across different markets
- Shao et al. initially cluster apps using resources (e.g., strings and images) and statistical features, then cluster on structural features
- Thung et al. also base their work on CLAN, but use the tags of systems in SourceForge instead of API calls to determine app similarity

Discussion Overall, do you think the paper succeeded in addressing the problem posed?

Discussion What strengths does this paper have?

Discussion What weaknesses does this paper have?

Discussion How useful do you think being able to detect functionally similar apps is? How might the ability to detect functionally similar apps be used?

Discussion What other methods of evaluation might they use to strengthen their argument that CLANdroid provides an effective method for detecting functionally similar apps?

Discussion How might this work be extended in the future?