International Journal of Advance Engineering and Research Development. A Survey on Data Mining Methods and its Applications

Similar documents
ISSN: (Online) Volume 3, Issue 9, September 2015 International Journal of Advance Research in Computer Science and Management Studies

Parametric Comparisons of Classification Techniques in Data Mining Applications

Study on the Application Analysis and Future Development of Data Mining Technology

COMP 465 Special Topics: Data Mining

Introduction to Data Mining and Data Analytics

Question Bank. 4) It is the source of information later delivered to data marts.

Correlation Based Feature Selection with Irrelevant Feature Removal

Classifying Twitter Data in Multiple Classes Based On Sentiment Class Labels

WKU-MIS-B10 Data Management: Warehousing, Analyzing, Mining, and Visualization. Management Information Systems

Chapter 28. Outline. Definitions of Data Mining. Data Mining Concepts

IJSRD - International Journal for Scientific Research & Development Vol. 4, Issue 05, 2016 ISSN (online):

International Journal of Data Mining & Knowledge Management Process (IJDKP) Vol.7, No.3, May Dr.Zakea Il-Agure and Mr.Hicham Noureddine Itani

Data Mining Technology Based on Bayesian Network Structure Applied in Learning

An Improved Document Clustering Approach Using Weighted K-Means Algorithm

Iteration Reduction K Means Clustering Algorithm

Data Mining. Introduction. Piotr Paszek. (Piotr Paszek) Data Mining DM KDD 1 / 44

D B M G Data Base and Data Mining Group of Politecnico di Torino

International Journal of Scientific Research & Engineering Trends Volume 4, Issue 6, Nov-Dec-2018, ISSN (Online): X

Discovery of Agricultural Patterns Using Parallel Hybrid Clustering Paradigm

International Journal of Computer Engineering and Applications, ICCSTAR-2016, Special Issue, May.16

Applications and Trends in Data Mining

A Comparative Study of Data Mining Process Models (KDD, CRISP-DM and SEMMA)

Recommendation on the Web Search by Using Co-Occurrence

CS423: Data Mining. Introduction. Jakramate Bootkrajang. Department of Computer Science Chiang Mai University

Data Mining Course Overview

Data mining overview. Data Mining. Data mining overview. Data mining overview. Data mining overview. Data mining overview 3/24/2014

TIM 50 - Business Information Systems

SK International Journal of Multidisciplinary Research Hub Research Article / Survey Paper / Case Study Published By: SK Publisher

Customer Clustering using RFM analysis

IJMIE Volume 2, Issue 9 ISSN:

Knowledge Discovery in Data Bases

Chapter 1, Introduction

Research on Data Mining and Statistical Analysis Xiaoyao Lu1, a

DATA MINING AND ITS TECHNIQUE: AN OVERVIEW

Knowledge Discovery and Data Mining

SOCIAL MEDIA MINING. Data Mining Essentials

Outlier Detection Using Unsupervised and Semi-Supervised Technique on High Dimensional Data

Data mining fundamentals

A STUDY OF SOME DATA MINING CLASSIFICATION TECHNIQUES

ANALYSIS COMPUTER SCIENCE Discovery Science, Volume 9, Number 20, April 3, Comparative Study of Classification Algorithms Using Data Mining

Knowledge Discovery. Javier Béjar URL - Spring 2019 CS - MIA

An Efficient Clustering for Crime Analysis

Research Article International Journals of Advanced Research in Computer Science and Software Engineering ISSN: X (Volume-7, Issue-6)

Dynamic Data in terms of Data Mining Streams

Data Mining Concepts

International Journal of Research in Advent Technology, Vol.7, No.3, March 2019 E-ISSN: Available online at

A SURVEY OF DATA MINING & ITS APPLICATIONS

Volume 3, Issue 9, September 2015 International Journal of Advance Research in Computer Science and Management Studies

A Review: Content Base Image Mining Technique for Image Retrieval Using Hybrid Clustering

Introduction to Data Mining

Introduction to Data Mining S L I D E S B Y : S H R E E J A S W A L

This tutorial has been prepared for computer science graduates to help them understand the basic-to-advanced concepts related to data mining.

Review on Text Mining

Gurpreet Kaur 1, Naveen Aggarwal 2 1,2

Clustering of Data with Mixed Attributes based on Unified Similarity Metric

The Transpose Technique to Reduce Number of Transactions of Apriori Algorithm

COMPARISON OF DIFFERENT CLASSIFICATION TECHNIQUES

Management Information Systems Review Questions. Chapter 6 Foundations of Business Intelligence: Databases and Information Management

Web Data mining-a Research area in Web usage mining

Implementation of Data Mining for Vehicle Theft Detection using Android Application

Overview of Web Mining Techniques and its Application towards Web

Enhancing Forecasting Performance of Naïve-Bayes Classifiers with Discretization Techniques

Yunfeng Zhang 1, Huan Wang 2, Jie Zhu 1 1 Computer Science & Engineering Department, North China Institute of Aerospace

Lies, Damned Lies and Statistics Using Data Mining Techniques to Find the True Facts.

Chapter 5: Summary and Conclusion CHAPTER 5 SUMMARY AND CONCLUSION. Chapter 1: Introduction

Web Mining Evolution & Comparative Study with Data Mining

Knowledge Discovery. URL - Spring 2018 CS - MIA 1/22

International Journal of Software and Web Sciences (IJSWS)

PESIT- Bangalore South Campus Hosur Road (1km Before Electronic city) Bangalore

Research on Applications of Data Mining in Electronic Commerce. Xiuping YANG 1, a

Analysis of Fragment Mining on Indian Financial Market

Data Mining. Ryan Benton Center for Advanced Computer Studies University of Louisiana at Lafayette Lafayette, La., USA.

KNOWLEDGE DISCOVERY AND DATA MINING

K-Mean Clustering Algorithm Implemented To E-Banking

Analyzing Outlier Detection Techniques with Hybrid Method

International Journal of Computer Engineering and Applications, Volume XI, Issue IX, August 17, ISSN

Clustering Analysis based on Data Mining Applications Xuedong Fan

Dynamic Clustering of Data with Modified K-Means Algorithm

Data Mining and Data Warehousing Introduction to Data Mining

Web Mining Using Cloud Computing Technology

Record Linkage using Probabilistic Methods and Data Mining Techniques

Computers Are Your Future

WEB USAGE MINING: ANALYSIS DENSITY-BASED SPATIAL CLUSTERING OF APPLICATIONS WITH NOISE ALGORITHM

Selection of n in K-Means Algorithm

CS377: Database Systems Data Warehouse and Data Mining. Li Xiong Department of Mathematics and Computer Science Emory University

Application of Clustering as a Data Mining Tool in Bp systolic diastolic

A SURVEY ON DATA MINING TECHNIQUES FOR CLASSIFICATION OF IMAGES

IMPLEMENTATION OF CLASSIFICATION ALGORITHMS USING WEKA NAÏVE BAYES CLASSIFIER

A Study on Mining of Frequent Subsequences and Sequential Pattern Search- Searching Sequence Pattern by Subset Partition

Domestic electricity consumption analysis using data mining techniques

Reliable Data Mining Tasks and Techniques for Industrial Applications

Frequent Item Set using Apriori and Map Reduce algorithm: An Application in Inventory Management

International Journal of Scientific & Engineering Research, Volume 4, Issue 7, July-2013 ISSN

9. Conclusions. 9.1 Definition KDD

Improving the Efficiency of Fast Using Semantic Similarity Algorithm

Index Terms Data Mining, Classification, Rapid Miner. Fig.1. RapidMiner User Interface

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY

Data mining techniques for actuaries: an overview

DATA WAREHOUSING IN LIBRARIES FOR MANAGING DATABASE

Data Mining. Yi-Cheng Chen ( 陳以錚 ) Dept. of Computer Science & Information Engineering, Tamkang University

Transcription:

Scientific Journal of Impact Factor (SJIF): 4.72 International Journal of Advance Engineering and Research Development Volume 5, Issue 01, January -2018 e-issn (O): 2348-4470 p-issn (P): 2348-6406 A Survey on Data Mining Methods and its Applications R.Deepa 1, S.Vaishnavi 2 Assistant Professor, Department of Information Technology, PSG College of Arts and Science, Coimbatore, Tamil Nadu,India 1 Research Scholar, Department of Information Technology, PSG College of Arts and Science, Coimbatore, Tamil Nadu, India 2 ABSTRACT:- Data mining is the process used to convert raw data into useful data using software. It is also the process of selecting, searching, analyzing database for patterns and recent trends in big amount of data. It describes hidden patterns and relationships helpful in decision making. Data mining is approach for extracting of information from vast database to produce a new instruction that provides to define further tract in knowledge discovery in database. This paper explains the survey of data mining methods such as clustering, regression, classification and summarization. This paper also tells about applications and methods in data mining. 1. INTRODUCTION Data mining technique that combines data analysis methods with advance algorithm for processing large amount of data. Data mining has the option for exploring and analyzing new type of data and old type of data in new way. Data mining analyses and summarizes the knowledge or data from different perspectives into effective information. Data mining is relating phase of knowledge discovery database (KDD). It abstracts from the hidden information similar and needful knowledge. Data mining is the technically way that finds patterns and correlation among number of field in huge database. This type of research that manipulates and stores decision making process and future prediction. 2. DATAMINING PROCESS AND STEPS Data mining is the process in which provides a technique to find vital values and convert information into knowledge from hidden information we can abstract similar wanted knowledge. It identifies source information it takes data points and analyzes the needful and extracting related knowledge, that find the key values for extracted data and reporting the results. So in data mining is known KDD (knowledge Discovery in Database). The process of KDD is to significance the identify goals and understandable pattern in data. In Data Mining, knowledge extraction seven steps occurs in Fig 1. Figure 1 Data Mining Process 1. Data cleaning: This step is to remove noise data and inconsistent or irrelevant data from volume of data. 2. Data integration: First process is to collect the data, in this step we are integral from all the different sources. We combine n number of data sources into single data store called as target data. 3. Data selection: Data relevant is to analysis the task room the retrieved database from numerous resources. @IJAERD-2018, All rights Reserved 463

4. Data transformation: The data is converted into required form of mining as we transform to appropriate for mining and techniques used to accomplish the functions are done such as smoothing, aggregation, normalization. 5. Data mining: This approach and tools used in this step that discovers data pattern and principles. 6. Pattern evaluation: This step, Patterns represents knowledge that are identified established on given estimate. 7. Knowledge representation: This is final step represents the knowledge technique which is used for the users helps to perceive and view the knowledge in data mining and outcome. The aim of knowledge discovers and data mining process is to discover the patterns that are not known among the large set of data to interpret needful information and knowledge. 3. DATAMINING METHODS Data mining process extracts the information for large volume of data sets and converts into predictable for future use. So it helps in achieving defined objectives. The data mining aims normally creates predictive and descriptive model [7]. A Predictive model turns worthy data giving sufficient information this uses data to determine headed future output of an event occurs it has many types of techniques of statistical modeling, machine learning, data mining and game theory that analysis current and historical facts to do predictive future. A Descriptive model concentrates data and past analysis events for instance how to future approach. Descriptive looks at historical performances by mining past data and looks for past success and failure. The Predictive and Descriptive models are represented fig 2. Fig 2.Predictive and Descriptive Models 1. Classification: Classification assigns new records to (predefined classes) advance establishment of classes, it uses to predictive the future outcome that is true or false describes range high or low. This is most familiar technique classification performs and assumes so data knowledge used to classify neural network, logistic regression, decision tree algorithm. Classification technique is use in segmenting of customer, business modeling and various applications. E.g. prediction of future employee records. 2. : It is an another predictive data modeling type it known as supervised learning technique. This is analyses is dependent on another attributes mainly that presents in same item regression knows the target value. models are computed by many statistical measures and the difference between predicted and expected values. E.g. prediction of house value and number of rooms, size etc. 3. Time series data analysis: Time series database is one of the popular data type it is used in many fields such as meteorological, business and medical. The recent change of this analysis is collecting in spot patterns. It can be analyzed to identify correlations. For E.g. Stock market that predict future values and defines similar patterns over time. 4. Prediction: Prediction calculates the rough value. The future is to predict a future than the current state. It defines the relationship between the dependent and independent variables. E.g sales and profit. It includes the fore warming of disasters occurs naturally(flooding, snowstorm). 5. Clustering: Clustering is the collection of similar group of objects in data. Another form of cluster is dissimilar of objects. Clustering is used in many applications mainly in research marketing, recognition of pattern and image @IJAERD-2018, All rights Reserved 464

processing. Clustering application is to further discover marketing to help marketers and use to develop targeted marketing functions. We can find closed and scattered regions in object space. For effective means of classification approach is used of distinguishing groups. This method is expensive, so it can be used as preprocessing approach. Used in for E.g astronomy, galaxies, studies about earthquake detection. 6. Summarization: Data summarization summaries calculated data that includes earlier stage and obtained stage. In case to make denied evaluation data that is generalized in nature. The data in data warehouse is very high amount of space. Data summarization is large multi dimensional datasets as the case of data warehouse is very challenging work. Summarization simplifies methods of tabulating the mean and standard deviation which is applied in data analysis, variation and generating report. 7. Association: Association technique is used to find frequent patterns, correlates from datasets. Association rule is mostly used to express sales. Association rule mining is used to analyzing customer behavior and predict the customer behavior in supermarket. This rule given in a set of transaction that finds rule activate us to estimate the future occurrence of selected item based on the occurrence of other items in the current process of transaction. This plays the major role in shopping analysis of data, clustering of product and designs of catalog. 8. Sequence Discovery: it finds the relation between one or more data. The set of object that is connected with its own graphical representation of time period, which marks that important event. For E.g. scientific experimentation and natural disaster and analysis the sequence of data. 1. Tabular Column of Methods Used in Data Mining Classification Neural Network Nearest Neighbor Classifier Prediction Artificial Intelligence Social Media Monitoring Customer Support and Chatbot Decision oblique tree Treedecision Linear Logistic Polynomial Stepwise Ridge Time Series Data Analysis Methods used in forecasting Exponential Smoothing Method V-Croston Clustering Summarization Connectivity Centroid Distribution Density Cluster Based Method Graph Theoretic Approach Latent Semantic Analysis Automatic Text Summarization Association Rule Based Machine Learning Frequent Item set Generation Rule Generation Computational Complexity Sequence Discovery Mining Sequence Patterns Finding Sequence Patterns Constraint Based Sequence Pattern Mining @IJAERD-2018, All rights Reserved 465

4. DATAMINING APPLICATIONS Data mining application is used by many types of large and small scale companies and industries with stronger concentration in financial and retail communication and groups into access data which has lower end level of hierarchically structured database their transactional data and determine the rate of customer priority and arranging of product. In data mining a retailer can use point out with records of customer sales and purchase to develop and promote specifically to the customers. 1. Healthcare and Medical: Data mining handles great and popular to higher the improvement in health systems it uses the data analytics that identifies good practices that improves care and reduce cost. A researcher uses data mining to detect and survey different kind of diseases and takes future measurements. 2. Education: Education data mining that mines knowledge discovering from originating the data it helps in future outcomes. To computerize the syllabus which is needy for practical. It provides E-learning it fetches from database. Data mining accurate the result of student database. EDM learns to rise the technology to teach in different methods. 3. Market Basket Analysis: Data mining provides hidden patterns in business that helps in planning and releasing of marketing in cost efficient that allows the seller and buyer to understand the business the makes profitable change and customer satisfaction. It uses transaction of credit and debit card and hidden correlation. 4. Finance Banking: Banking uses to identify the customers to analyze the purchase, retain and stock trading rules. The data mining acquires in targeting and maintaining the profitable customers. It helps and improves in customer relationship management. 5. Agriculture: Agriculture in data mining emerges the technology in for crop yielding bases of occurrence of rainfall, production and seasonable changes. Data mining techniques such as k means and Artificial Neural Network. 6. Cloud Computing: Data mining that implements and utilizes programming, application of software managing of data storage.the huge volume of business data can be stored in information center in low estimation. It enables and performs that is efficient, reliable and secured and low cost efficiency for users. 7. Transportation: Data mining determines and distributes the schedule from warehouse and analyses the patterns. 8. Engineering in Manufacturing: Data mining tools is very useful in discovery patterns in manufacturing process. DM works in system level to design and extract the relationship of architecture product and product portfolio. It predicts the development of span and cost of the product. 9. Lie Detection: Lie detection is easy to bring the truth. Mining Technique used to detect investigate the crimes it includes text mining too. The sample data which collected from previous investigators that compares model of lie detection which is created. Law enforcement can use mining techniques for investigating of crime and monitoring suspected terrorists. 10. E- Commerce: E-commerce Company s uses data mining to check the range of selling customers that delivers number of products according to customer view. It used to sell and shows which preferred and update the trends. It helps to buy and sell faster and delivers on time. It checks for low operational and best quality. 11. Bio Informatics: this technology used to manage biological information bio informatics has the science of storing, extracting, analyzing, interpreting and utilizing the information of sequence and models. It is uses recent trend of biology genomics and research biomedical. This application includes discovery of structural patterns analysis of genetic networks. 12. Financial Data Analysis: This facility is systematic data analysis and mining in banking and industries in finance is reliable and systematic it designs and builds of data warehouse for multi dimensional data analysis and mining it helps customer in loan payment and credit policy. 5. CONCLUSION Data mining is powerful and useful technique that manipulates data performance and gives exact and target output from large amount of data that grows world wide it helps the experts to validate, analysis of various application results. Data mining is purposely used for present and future domain. It provides high security, maintenance in business portfolio. This paper discusses about data mining ideas, KKD process and various techniques. We discussed some vision of data mining application. @IJAERD-2018, All rights Reserved 466

REFERENCES 1. Aarti Sharma et al, Application of Data Mining A Survey Paper,International Journal of Computer Science and Information technologies, Vol,5(2), 2014. 2. Smita, Priti and Sharma, Use of Data Mining in Various Field: A Survey Paper IOSR Journal of Computer Engineering,8727 Volume 16, Issue 3, Ver.V(May June 2014) 3. Data Mining: Concepts and Techniques, Third Edition (The Morgan Kaufmann Series in Data Management System s ) 3 rd Edition. 4. Brijesh Kumar Baradwaj, Saurabh Pal Mining Educational Data to Analyze Students Performance (IJACSA) International Journal of Advanced Computer Science and Applications, Vol.2, No 6, 2011. 5. J. Han and M.Kamber, Data Mining, Concepts and Techniques, Morgan kauffmann, 2000. 6. Nikitha Jain, Vishal Srivastava DATA MINING TECHNIQUES: A SURVEY PAPER IJRET: International Journal of Research in Engineering and Technology, Volume:02 Issue: 11 Nov-2013. 7. Pradnya P.SondWale, Overview of Predictive and Descriptive Data Mining Techniques IJARCSSE, Volume 5, Issue 4, April 2015. 8. Prof.Dr.Wofgang Karl Hardle, Time Series Data Mining Techiques IJARCSSSE, Volume 5, Issue 4, April 2015. @IJAERD-2018, All rights Reserved 467