
Implementation of Data Mining for Vehicle Theft Detection using Android Application

Sandesh Sharma 1, Praneetrao Maddili 2, Prajakta Bankar 3, Rahul Kamble 4 and L. A. Deshpande 5

1 Student, Department of Computer Engineering, Vishwakarma Institute of Information Technology, Pune, sandeshsharma770@gmail.com
2 Student, Department of Computer Engineering, Vishwakarma Institute of Information Technology, Pune, praneetrao007@gmail.com
3 Student, Department of Computer Engineering, Vishwakarma Institute of Information Technology, Pune, bankarprajakta5@gmail.com
4 Student, Department of Computer Engineering, Vishwakarma Institute of Information Technology, Pune, rahul.kamble827@gmail.com
5 Professor, Department of Computer Engineering, Vishwakarma Institute of Information Technology, Pune

ABSTRACT

Vehicle theft is increasing day by day, and the task of finding stolen vehicles and the criminals responsible has become tedious and increasingly difficult. The traffic police department generally relies on manually searching criminal records whenever a suspect is detected. By monitoring the criminal records and database, predictions about stolen vehicles can be made. We focus on building and deploying a collaborative Android application that helps maintain and track traffic-related criminal records using sequential pattern mining, clustering and frequent pattern mining techniques, in which trends found in similar past events are used to predict the criminals involved in vehicle theft.

Keywords

Sequential Pattern Mining - Apriori, Frequent Pattern Mining - FP Growth.

1. INTRODUCTION

Data mining is the process of extracting useful information from a database; this information can be used to analyze the data and to support decisions based on it.
Data mining can be used to find interesting characteristics and trends that are not explicitly represented in ordinary databases. These techniques can play an important role in understanding data and in capturing fundamental relationships among data instances; one such technique is sequential pattern mining. Sequential pattern mining deals with a sequence database consisting of sequences of ordered elements or events. In sequential pattern mining, sequences of data are formed and then matched against the patterns that are generated. The occurrence of certain events in time, e-commerce website traversal, routing in computer networks and something as simple as the spelling of a text string are examples of where the occurrence of sequences may be significant and where the recognition of frequent (entirely or partially ordered) subsequences might prove useful. It is for the discovery of such subsequences that the technology of sequential pattern mining has arisen. Similarly, in our project work sequences of data will be formed, and the generated patterns will be matched against them to analyze which sequences occur frequently.

In this project work, the focus is on structuring and using data mining techniques to analyze vehicle theft cases. The vehicle theft data used for predicting criminals describes different criminal cases, both solved and yet to be solved. We propose a generalized framework, in the form of an Android application, to discover different types of patterns in theft data sets effectively. These patterns can then be used to recognize a variety of interactions among the factors of theft and to predict criminals.

VOLUME-1, ISSUE-6, NOVEMBER-2014 COPYRIGHT 2014 IJREST, ALL RIGHT RESERVED

2. DATA MINING IN POLICE ADMINISTRATION

Until now, the police in India have used ICT and computers for electronic data collection and for computerizing various processes through software such as CIPA, CCIS, CCTNS and others. This software has helped the police considerably in searching for relevant records of crimes and criminals and for other information required from time to time, saving time and effort. With the help of data mining, however, the police can also benefit by extracting useful trends, patterns and behavior among similar cases for future planning. [4] By applying data mining to vehicle theft cases, the data in criminal records can be transformed into information and finally into knowledge using data mining algorithms. In today's culture, an Android phone has become a near necessity for every individual.
Applying data mining techniques together with current technology such as Android phones can yield a highly beneficial application for the traffic police department, one that readily helps officers search criminal records and analyze theft data. By analyzing this data and gaining knowledge from it, efforts can be made to prevent future crimes, thus helping in future planning.

3. TECHNIQUE FOR SEQUENTIAL PATTERN DATA MINING

3.1. Apriori Algorithm

Apriori is an algorithm for frequent item set mining and association rule learning over transactional databases. It proceeds by identifying the individual items that occur frequently in the database and extending them to larger and larger item sets, as long as those item sets still appear sufficiently often in the database. The frequent item sets discovered by Apriori can be used to determine association rules that highlight general trends in the database; this has applications in domains such as market basket analysis.

Apriori is designed to operate on databases containing transactions, such as collections of items purchased by customers. Other algorithms are designed to discover association rules in data that have no transactions or no timestamps (DNA sequencing being a common example). In Apriori, each transaction is considered to be a set of items. Given a threshold C, the Apriori algorithm identifies the item sets that are subsets of at least C transactions in the database.

Apriori uses a "bottom up" approach, in which frequent subsets are extended one item at a time (a step known as candidate generation) and groups of candidates are tested against the data. The algorithm terminates when no further successful extensions are found. Apriori uses breadth-first search and a hash tree structure to count candidate item sets efficiently.
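The level-wise procedure described here can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the implementation used in the project: the function name `apriori`, the `min_support` parameter (playing the role of the threshold C) and the toy theft records are all hypothetical, and a plain dictionary stands in for the hash tree.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {item set: support count} for every item set that is
    contained in at least min_support transactions."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count the individual items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        prev = set(frequent)
        # Candidate generation: join frequent (k-1)-item sets into k-item sets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Pruning (downward closure): every (k-1)-subset must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # Count the surviving candidates with one scan of the database.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result


# Hypothetical theft records, each reduced to a set of attribute values.
records = [
    {"area=A", "age=18-25", "type=bike"},
    {"area=A", "age=18-25", "type=car"},
    {"area=A", "age=26-35", "type=bike"},
    {"area=B", "age=18-25", "type=bike"},
]
frequent_sets = apriori(records, min_support=2)
```

With these made-up records, the pair `{"area=A", "age=18-25"}` is frequent because it occurs in two records, while the triple that adds `"type=bike"` occurs only once and is discarded.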
In each pass, candidate item sets of length k are generated from item sets of length k-1. The candidates that have an infrequent sub-pattern are then pruned. According to the downward closure lemma, the candidate set contains all frequent k-length item sets. [2] After that, the transaction database is scanned to determine the frequent item sets among the candidates.

Apriori is an efficient algorithm that can be applied in our project work, since the work involves finding frequently occurring data sets. By applying the Apriori algorithm to vehicle theft data, frequently occurring data related to particular areas, age groups of people, etc. can be analyzed to learn the areas where crime is committed more frequently and the age groups of the people who most frequently commit it.

Following is the pseudocode for the algorithm, for a transaction database T and a given support threshold. The usual set-theoretic notation is employed, though it should be noted that T is a multiset. There is a candidate set for each level. The Generate() algorithm is assumed to generate the candidate sets from the large item sets of the preceding level, observing the downward closure lemma. count[c] accesses the field of the data structure that represents candidate set c, which is initially assumed to be zero. Many details are omitted below; the data structure used for storing the candidate sets and counting their frequencies is usually the most important part of the implementation.

4. TECHNIQUE FOR FREQUENT PATTERN DATA MINING

4.1. FP Growth Algorithm

FP Growth is another algorithm that can be applied in our project work on vehicle theft detection. In our project work, as in data mining generally, finding frequent patterns is an important and tedious job. The FP Growth algorithm is designed for this purpose of finding frequent patterns in a large database. When the database is large and contains many frequent patterns, this task is quite expensive and difficult.

FP-Growth is an efficient and scalable algorithm for mining frequent patterns by pattern fragment growth; it uses a prefix-tree structure to store the crucial information about the frequent patterns. This prefix-tree structure is called the frequent pattern tree (FP-tree). FP-Growth makes use of techniques from Apriori and tree projection together to find the frequent patterns in a large database. Later studies have reported FP-Growth to be among the most efficient frequent pattern mining algorithms, since its tree structure helps find frequent patterns more efficiently. The algorithm is an alternative way of finding frequent item sets without candidate generation, which improves performance. Following is the algorithm for FP-tree construction:

Using this algorithm, the FP-tree is constructed in two scans of the database. The first scan selects and sorts the set of frequent items, and the second scan constructs the FP-tree. Once the FP-tree has been constructed, it can be mined to find the frequent patterns in the database. The FP-Growth algorithm can be given as follows:
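The two-scan construction of the FP-tree can be illustrated with a minimal Python sketch. It is not the pseudocode listing referred to in the text, and it covers tree construction only, not the mining step; the names `FPNode`, `build_fp_tree` and `header`, the tie-breaking rule for equally frequent items, and the toy transactions are assumptions made for illustration.

```python
from collections import Counter


class FPNode:
    """One node of the prefix tree: an item, its count, and its children."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_support):
    # First scan: count the items and keep only the frequent ones.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in counts.items() if c >= min_support}

    root = FPNode(None)
    header = {item: [] for item in frequent}  # item -> tree nodes carrying it

    # Second scan: insert each transaction with its frequent items ordered
    # by descending count (ties broken by item name, an assumption).
    for t in transactions:
        items = sorted(
            (i for i in set(t) if i in frequent),
            key=lambda i: (-frequent[i], i),
        )
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, parent=node)
                node.children[item] = child
                header[item].append(child)
            child.count += 1
            node = child
    return root, header


# Hypothetical transactions over items a, b, c, d.
transactions = [
    ["a", "b", "c"],
    ["a", "b"],
    ["a", "d"],
    ["b", "c"],
]
root, header = build_fp_tree(transactions, min_support=2)
```

Because transactions that share a prefix share a path in the tree, the first three toy transactions all pass through the same node for item `a`, whose count therefore reaches 3; the infrequent item `d` never enters the tree.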

When the FP-tree contains a single prefix-path, the complete set of frequent patterns can be generated in three parts: the single prefix-path P, the multipath Q, and their combinations (lines 01 to 03 and 14). The resulting patterns for a single prefix path are the enumerations of its sub-paths that have the minimum support (lines 04 to 06). Thereafter, the multipath Q is defined (line 03 or 07) and the resulting patterns from it are processed (lines 08 to 13). Finally, in line 14 the combined results are returned as the frequent patterns found.

With the two algorithms described above, frequent patterns can be found in a large database; these patterns can then be used to carry out data mining, gain knowledge and take decisions accordingly.

5. DATA PROCESSING

5.1 Data Collection

The dataset used for this work is to be collected from the Traffic Department, Pune. The dataset would contain data from the last few years.

5.2 Data Preprocessing

In this project, the theft data consists of various parameters such as the name of the vehicle owner, the city, the name of the police station where the case is reported, the case registration number, etc. Pre-processing means removing the unwanted parameters from the dataset. Data pre-processing includes the following processes:

5.2.1 Data Cleaning

In this stage, a consistent format for the data model is developed that takes care of missing data, finds duplicated data and weeds out bad data. Finally, the cleaned data is transformed into a format suitable for data mining.

5.2.2 Data Selection

At this stage, the data relevant to the analysis is decided on and retrieved from the dataset.

5.3 Data Transformation

This is also known as data consolidation. It is the stage in which the selected data is transformed into forms appropriate for data mining.

5.4 Analyzing data using predictive data mining concepts

In this phase the extracted patterns are processed using predictive data mining algorithms, which would lead to an improved approximate prediction.

Fig 4.1 Basic workflow of the system.
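The cleaning, selection and transformation stages can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the project's actual pipeline: the function `preprocess`, the field names (`case_no`, `city`, `station`, `vehicle`) and the sample records are hypothetical, and duplicates are assumed to be identifiable by a repeated case registration number.

```python
def preprocess(records, selected_fields, key_field="case_no"):
    """Clean raw case records, keep the selected fields, and turn each
    record into a transaction (a set of "field=value" items)."""
    seen_keys = set()
    transactions = []
    for record in records:
        # Cleaning: drop records that repeat a case number already seen.
        key = record.get(key_field)
        if key in seen_keys:
            continue
        seen_keys.add(key)

        # Selection: keep only the fields relevant to the analysis.
        values = [record.get(f) for f in selected_fields]

        # Cleaning: drop records with a missing value in a selected field.
        if any(v is None or str(v).strip() == "" for v in values):
            continue

        # Transformation: normalise the format and build one item per field.
        transactions.append(
            frozenset(
                f"{f}={str(v).strip().lower()}"
                for f, v in zip(selected_fields, values)
            )
        )
    return transactions


# Hypothetical raw records; all values are made up for illustration.
raw = [
    {"case_no": "1", "city": "Pune", "station": "S1", "vehicle": "Bike"},
    {"case_no": "1", "city": "Pune", "station": "S1", "vehicle": "Bike"},
    {"case_no": "2", "city": "pune ", "station": "S1", "vehicle": "bike"},
    {"case_no": "3", "city": "Pune", "station": "", "vehicle": "Car"},
    {"case_no": "4", "city": "Pune", "station": "S2", "vehicle": "Car"},
]
cases = preprocess(raw, selected_fields=["city", "station", "vehicle"])
```

Deduplicating on the case number rather than on the selected attributes is a deliberate choice in this sketch: two distinct cases with identical attributes both remain in the output, so the support counts computed later are not understated.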

6. CONCLUSION

At present, data is actively collected and stored in large data warehouses in most government and social organizations. It is necessary to recognize the potential value of these data stores as an information source for making useful decisions. Data mining can cater to this requirement for decision support in every area of the enforcement agencies. We have tried to use this concept to find criminal records effectively and to predict the likely suspect of a crime that has been committed. We have used data mining techniques to study criminal cases, which can help the traffic department analyze details such as crime-prone areas and the age groups of people committing crimes, and maintain effective criminal records that can help in any case study.

REFERENCES

[1] Han J., Kamber M., Data Mining: Concepts and Techniques, Morgan Kaufmann, 2000.
[2] Anita Agustina and Nor Azura Husin, Sequential pattern mining on library transaction data, IEEE conference, 2010. doi:978-1-4244-6716-7/10/ 2010
[3] Arthur Pitman and Markus Zanker, Insights from Applying Sequential Pattern Mining to E-commerce Click Stream Data, IEEE conference, 2010. doi:10.1109/icdmw.2010.31
[4] Singh Dilbag, Kumar Omda, A Case for Data Mining in Police Administration, International Journal of Computer Applications (0975-8887), Volume 63, No. 17, February 2013.