Neural Networks In Data Mining

Similar documents
Neural Network in Data Mining

6. NEURAL NETWORK BASED PATH PLANNING ALGORITHM 6.1 INTRODUCTION

Analytical model A structure and process for analyzing a dataset. For example, a decision tree is a model for the classification of a dataset.

Review on Methods of Selecting Number of Hidden Nodes in Artificial Neural Network

Climate Precipitation Prediction by Neural Network

Neuro-fuzzy, GA-Fuzzy, Neural-Fuzzy-GA: A Data Mining Technique for Optimization

ISSN: (Online) Volume 3, Issue 9, September 2015 International Journal of Advance Research in Computer Science and Management Studies

AMOL MUKUND LONDHE, DR.CHELPA LINGAM

International Journal of Scientific Research & Engineering Trends Volume 4, Issue 6, Nov-Dec-2018, ISSN (Online): X

Chapter 28. Outline. Definitions of Data Mining. Data Mining Concepts

DATA MINING AND WAREHOUSING

Neural Networks. CE-725: Statistical Pattern Recognition Sharif University of Technology Spring Soleymani

CS6220: DATA MINING TECHNIQUES

This tutorial has been prepared for computer science graduates to help them understand the basic-to-advanced concepts related to data mining.

Data Mining and Warehousing

turning data into dollars

Yunfeng Zhang 1, Huan Wang 2, Jie Zhu 1 1 Computer Science & Engineering Department, North China Institute of Aerospace

Basic Data Mining Technique

9. Lecture Neural Networks

Introduction to Data Mining and Data Analytics

Final Exam. Controller, F. Expert Sys.., Solving F. Ineq.} {Hopefield, SVM, Comptetive Learning,

Data Mining Technology Based on Bayesian Network Structure Applied in Learning

Data Mining and Analytics

Machine Learning in Biology

DATA MINING Introductory and Advanced Topics Part I

Enhancing Forecasting Performance of Naïve-Bayes Classifiers with Discretization Techniques

RETRACTED ARTICLE. Web-Based Data Mining in System Design and Implementation. Open Access. Jianhu Gong 1* and Jianzhi Gong 2

1. Introduction. 2. Motivation and Problem Definition. Volume 8 Issue 2, February Susmita Mohapatra

11/14/2010 Intelligent Systems and Soft Computing 1

4.12 Generalization. In back-propagation learning, as many training examples as possible are typically used.

WKU-MIS-B10 Data Management: Warehousing, Analyzing, Mining, and Visualization. Management Information Systems

Back propagation Algorithm:

II. ARTIFICIAL NEURAL NETWORK

Computers Are Your Future

Management Information Systems MANAGING THE DIGITAL FIRM, 12 TH EDITION FOUNDATIONS OF BUSINESS INTELLIGENCE: DATABASES AND INFORMATION MANAGEMENT

INTRODUCTION... 2 FEATURES OF DARWIN... 4 SPECIAL FEATURES OF DARWIN LATEST FEATURES OF DARWIN STRENGTHS & LIMITATIONS OF DARWIN...

Neural Network Classifier for Isolated Character Recognition

International Journal of Advance Engineering and Research Development. A Survey on Data Mining Methods and its Applications

Cse634 DATA MINING TEST REVIEW. Professor Anita Wasilewska Computer Science Department Stony Brook University

Liquefaction Analysis in 3D based on Neural Network Algorithm

Image Compression: An Artificial Neural Network Approach

The Data Mining usage in Production System Management

INTRODUCTION TO DATA MINING. Daniel Rodríguez, University of Alcalá

Research on Applications of Data Mining in Electronic Commerce. Xiuping YANG 1, a

Learning. Learning agents Inductive learning. Neural Networks. Different Learning Scenarios Evaluation

Data Mining. Ryan Benton Center for Advanced Computer Studies University of Louisiana at Lafayette Lafayette, La., USA.

Topics covered 10/12/2015. Pengantar Teknologi Informasi dan Teknologi Hijau. Suryo Widiantoro, ST, MMSI, M.Com(IS)

Ascertaining Apropos Mining Algorithms for Business Applications

Simulation of Zhang Suen Algorithm using Feed- Forward Neural Networks

Study on the Application Analysis and Future Development of Data Mining Technology

Data Mining Concepts

Data Mining Classification through Simulation Technique

Use of Artificial Neural Networks to Investigate the Surface Roughness in CNC Milling Machine

CS 4510/9010 Applied Machine Learning. Neural Nets. Paula Matuszek Fall copyright Paula Matuszek 2016

Simulation of Back Propagation Neural Network for Iris Flower Classification

International Journal of Data Mining & Knowledge Management Process (IJDKP) Vol.7, No.3, May Dr.Zakea Il-Agure and Mr.Hicham Noureddine Itani

Multi-Layer Perceptron Network For Handwritting English Character Recoginition

Fingerprint Feature Extraction Based Discrete Cosine Transformation (DCT)

by Prentice Hall

A SURVEY ON DATA MINING TECHNIQUES FOR CLASSIFICATION OF IMAGES

Performance Analysis of Data Mining Classification Techniques

Design and Performance Analysis of and Gate using Synaptic Inputs for Neural Network Application

Parametric Comparisons of Classification Techniques in Data Mining Applications

A Comparative study of Clustering Algorithms using MapReduce in Hadoop

Question Bank. 4) It is the source of information later delivered to data marts.

Neural Network Approach for Automatic Landuse Classification of Satellite Images: One-Against-Rest and Multi-Class Classifiers

1) Give decision trees to represent the following Boolean functions:

Summary. Machine Learning: Introduction. Marcin Sydow

UNIT -1 UNIT -II. Q. 4 Why is entity-relationship modeling technique not suitable for the data warehouse? How is dimensional modeling different?

Knowledge Discovery in Data Bases

Disquisition of a Novel Approach to Enhance Security in Data Mining

Knowledge Engineering and Data Mining. Knowledge engineering has 6 basic phases:

International Conference on Advances in Mechanical Engineering and Industrial Informatics (AMEII 2015)

Supervised vs.unsupervised Learning

Argha Roy* Dept. of CSE Netaji Subhash Engg. College West Bengal, India.

What is Data Mining? Data Mining. Data Mining Architecture. Illustrative Applications. Pharmaceutical Industry. Pharmaceutical Industry

Fraud Detection using Machine Learning

arxiv: v1 [physics.data-an] 27 Sep 2007

What is Data Mining? Data Mining. Data Mining Architecture. Illustrative Applications. Pharmaceutical Industry. Pharmaceutical Industry

Logical Rhythm - Class 3. August 27, 2018

CS 4510/9010 Applied Machine Learning

A Brief Introduction to Data Mining

Pattern Classification Algorithms for Face Recognition

A Data Classification Algorithm of Internet of Things Based on Neural Network

ABSTRACT I. INTRODUCTION. Dr. J P Patra 1, Ajay Singh Thakur 2, Amit Jain 2. Professor, Department of CSE SSIPMT, CSVTU, Raipur, Chhattisgarh, India

COMP 465 Special Topics: Data Mining

Customer Clustering using RFM analysis

Applying Supervised Learning

A *69>H>N6 #DJGC6A DG C<>C::G>C<,8>:C8:H /DA 'D 2:6G, ()-"&"3 -"(' ( +-" " " % '.+ % ' -0(+$,

Dr. Qadri Hamarsheh Supervised Learning in Neural Networks (Part 1) learning algorithm Δwkj wkj Theoretically practically

Automatic Modularization of ANNs Using Adaptive Critic Method

Mineral Exploation Using Neural Netowrks

Comparative analysis of data mining methods for predicting credit default probabilities in a retail bank portfolio

CLASSIFICATION WITH RADIAL BASIS AND PROBABILISTIC NEURAL NETWORKS

INVESTIGATING DATA MINING BY ARTIFICIAL NEURAL NETWORK: A CASE OF REAL ESTATE PROPERTY EVALUATION

WHAT TYPE OF NEURAL NETWORK IS IDEAL FOR PREDICTIONS OF SOLAR FLARES?

Using Machine Learning to Optimize Storage Systems

International Journal of Mechatronics, Electrical and Computer Technology

A Study on Mining of Frequent Subsequences and Sequential Pattern Search- Searching Sequence Pattern by Subset Partition

Data Mining. Introduction. Piotr Paszek. (Piotr Paszek) Data Mining DM KDD 1 / 44

Transcription:

Neural Networks In Mining Abstract-The application of neural networks in the data mining has become wider. Although neural networks may have complex structure, long training time, and uneasily understandable representation of results, neural networks have high acceptance ability for noisy data and high accuracy and are preferable in data mining. In this paper the data mining based on neural networks is disscussed in detail, and the key technology and ways to achieve the data mining based on neural networks are also disscussed. KEYWORDSNeural Network (NN), neural network topology, back propagation algorithm, Advantages, mining, data mining process, implementation. I. INTRODUCTION mining is the term used to describe the process of extracting value from a database. A data-warehouse is a location where information is stored. The type of data stored depends largely on the type of industry and the company. mining tools can forecast the future trends and activities to support the decision of people. For example, through analyzing the whole database system of the company the data mining tools can answer the problems such as Which custom-- -er is most likely to respond to the e-mail marketing activities of our company, why, and other similar problems. Some data mining tools can also resolve some traditional problems which consumed much time, this is because that they can rapidly browse the entire database and find some useful information experts unnoticed. Neural networksrepresent a brain metaphor for information processing. These models are biologically inspired rather than an exact Mohan Vishal Gupta 1, Namit Gupta 2, Sachin Singh 3 Department of Computer Science, CCSIT Teerthanker Mahaveer University, Moradabad 1 mohan.vishal01@gmail.com 2 namit.k.gupta@gmail.com 3 ingh.sachin1986@gmail.com replica of how the brain actually functions. Neural networks have been shown to be very promising systems in many forecasting applications and business classification applications due to their ability to learn from the data, their nonparametric nature (i.e., no rigid assumptions), and their ability to generalize. II. NEURALNETWORK A Neural Network (NN) is a mathematical model or computational model based on biological neural networks, in other words, is an emulation of biological neural system. It consists of an interconnected group of artificial neurons and processes information using a connectionist approach to computation. In most cases an NN is an adaptive system that changes its structure based on external or internal information that flows through the network during the learning phase. But its advantages such as high affordability to the noise data and low error rate, the continuously advancing and optimization of various network training algorithms, especially the continuously advancing and improvement of various network pruning algorithms and rules extracting algorithm, make the application of the neural network in the data mining increasingly favored by the overwhelming majority of users. In this paper the data mining based on the neural network is discussed in detail.neural computingrefers to a pattern recognition methodology for machine learning. The resulting model from neural 370

computing is often called an artificial neural network (ANN) or a neural network. Neural networks have been used in many business applications for pattern recognition, forecasting, prediction, and classification. Neural network computing is a key component of any data mining tool kit. A number of advances in technology and business processes have contributed to a growing interest in data mining in both the public and private sectors. Some of these changes include the growth of computer networks, which can be used to connect databases; the development of enhanced search-related techniques such as neural networks and advanced algorithms; the spread of the client/server computing model, allowing users to access centralized data resources from the desktop; and an increased ability to combine data from disparate sources into a single search source. Fig 1:Ariticial Neural Network III. NEURALNETWORKTOPOLOGIES: Feed forward neural network: The feed forward neural network was the first and arguably simplest type of artificial neural network devised. In this network, the information moves in only one direction, forward, from the input nodes, through the hidden nodes (if any) and to the output nodes. There are no cycles or loops in the network. The data processing can extend over multiple (layers of) units, but no feedback connections are present, that is, connections extending from outputs of units to inputs of units in the same layer or previous layers. Recurrent network: Recurrent neural networks that do contain feedback connections. Contrary to feed forward networks, recurrent neural networks (RNs) are models with bi-directional data flow. While a feed forward network propagates data linearly from input to output, RNs also propagate data from later processing stages to earlier stages. Training of Artificial Neural Networks: A neural network has to be configured such that the application of a set of inputs produces (either 'direct' or via a relaxation process) the desired set of outputs. Various methods to set the strengths of the connections exist. One way is to set the weights explicitly, using a priori knowledge. Another way is to 'train' the neural network by feeding it teaching patterns and letting it change its weights according to some learning rule. We can categorize the learning situations as follows: Supervised learning or Associative learning in which the network is trained by providing it with input and matching output patterns. These input-output pairs can be provided by an external teacher, or by the system which contains the neural network (self-supervised). Unsupervised learning or Self-organization in which an (output) unit is trained to respond to clusters of pattern within the input. In this paradigm the system is supposed to discover statistically salient features of the input population. Unlike the supervised learning paradigm, there is no a priori set of categories into which the patterns are to be classified; rather the system must develop its own representation of the input stimuli. Reinforcement Learning This type of learning may be considered as an intermediate form of the above two types of learning. Here the learning machine does some action on the environment and gets a feedback response 371

from the environment. The learning system grades its action good (rewarding) or bad (punishable) based on the environmental response and accordingly adjusts its parameters. IV. NEURAL NETWORKS IN DATA MINING There are seven common methods and techniques of data mining which are the methods of statistical analysis, rough set, covering positive and rejecting inverse cases, formula found, fuzzy method, as well as visualization technology. Here, we focus on neural network method. Neural network method is used for classification, clustering, feature mining, prediction and pattern recognition. It imitates the neurons structure of animals, bases on the M-P model and Hebb learning rule, so in essence it is a distributed matrix structure. Through training data mining, the neural network method gradually calculates (including repeated iteration or cumulative calculation) the weights the neural network connected. Fig 2: Mining Process mining is the business of answering questions that you ve not asked yet. mining reaches deep into databases. mining tasks can be classified into two categories: Descriptive and predictive data mining. Descriptive data mining provides information to understand what is happening inside the data without a predetermined idea. Predictive data mining allows the user to submit records with unknown field values, and the system will guess the unknown values based on previous patterns discovered form the database. mining models can be categorized according to the tasks they perform: Classification and Prediction, Clustering, Association Rules. Classification and prediction is a predictive model, but clustering and association rules are descriptive models. i.classification:the most common action in data mining is classification. It recognizes patterns that describe the group to which an item belongs. It does this by examining existing items that already have been classified and inferring a set of rules. ii. Clustering: Similar to classification is clustering. The major difference being that no groups have been predefined. iii. Prediction: Prediction is the construction and use of a model to assess the class of an unlabeled object or to assess the value or value ranges of a given object is likely to have. iv. Forecasting: The next application is forecasting. This is different from predictions because it estimates the future value of continuous variables based on patterns within the data. Neural networks, depending on the architecture, provide massociations, classifications, clusters, prediction and forecasting to the data mining industry. V. DATAMINING PROCESS BASED ON NEURAL NETWORK mining process can be composed by three main phases: A. preparation, B. mining, C. Expression and interpretation of the results, mining process is the reiteration of the three phases. 372

Sources Integration Selection Target Sources Preprocess Result Knowledge Expression Information Mining Preprocessed Fig.3. General Mining Process Original Preparing Rules Extracting Useful Rules Rules Assessment 373

Fig.4. Mining Process Based On Neural Network preparation: preparation is to define and process the mining data to make it fit specific data mining method. preparation is the first important step in the data mining and plays a decisive role in the entire data mining process. It mainly includes the following four processes: a) cleaning: cleansing is to fill the vacancy value of the data, eliminate the noise data and correct the inconsistencies data in the data. b) option: option is to select the data arrange and row used in this mining. c) preprocessing: preprocessing is to enhanced process the clean data which has been selected. d) expression: expression is to transform the data after preprocessing into the form which can be accepted by the data mining algorithm based on neural network. Rules Extracting There are many methods to extract rules, in which the most commonly used methods are LRE method, black-box method, the method of extracting fuzzy rules, the method of extracting rules from recursive network, the algorithm of binary input and output rules extracting (BIO-RE), partial rules extracting algorithm (Partial-RE) and full rules extracting algorithm (Full-RE). Rules Assessment Although the objective of rules assessment depends on each specific application, but, in general terms, the rules can be assessed in accordance with the following objectives. 1) Find the optimal sequence of extracting rules, making it obtains the best results in the given data set; 2) Test the accuracy of the rules extracted; 3) Detect how much knowledge in the neural network has not been extracted; 4) Detect the inconsistency between the extracted rules and the trained neural network. VI. KEY TECHNIQUES AND APPROACHES OF IMPLEMENTATION Effective Combination of Neural Network and Mining Technology The technology almost uses the original NN software package or transformed from existing NN development tools, the workflow of data mining should be understood in depth, the data model and application interfaces should be described with standardized form, then the two technologies can be effectively integrated and together complete data mining tasks. Therefore, the approach of organically combining the NN and data mining technologies should be found to improve and optimize the data mining technology. Effective Combination of Knowledge Processing and Neural Computation Evaluating whether a data mining implementation algorithm is fine the following indicators and characteristics can be used: (1) Whether high-quality modeling under the circumstances of noise and data half-baked; (2) The model must be understood by users and can be used for decision-making; (3) The model can receive area knowledge (rules enter and extraction) to improve the modeling quality. Existing neural network has 374

high precision in the quality of modeling but low in the latter two indicators. Neural network actually can be seen as a black box for users, the application restrictions makes the classification and prediction process can not be understood by users and directly used for decision-making. For data mining, it not enough to depend on the neural network model providing results because that before important decision-making users need to understand the rationale and justification for the decision-making. Therefore, in the NN data mining knowledge base should be established in order to accede domain knowledge and the knowledge NN learning to the system in the data mining process. That is to say, in the NN data mining, it is necessary to use knowledge method to extract knowledge from the data mining process and realize the inosculation of the knowledge processing and neural network. In addition, in the system an effective decision and explanation mechanism should also be considered to be established to improve the validity and practicability of the NN data mining technology. VII. ADVANTAGES OF NEURAL NETWORKS: 1. High Accuracy: Neural networks are able to approximate complex non-linear mappings 2. Noise Tolerance: Neural networks are very flexible with respect to incomplete, missing and noisy data. 3. Independence from prior assumptions: Neural networks do not make a priori assumptions about the distribution of the data,or the form of interactions between factors. 4. Ease of maintenance: Neural networks can be updated with fresh data, making them useful for dynamic environments. 5. Neural networks can be implemented in parallel hardware 6. When an element of the neural network fails, it can continue without any problem by their parallel nature. The following are examples of where neural networks have been used. Accounting _Identifying tax fraud _Enhancing auditing by finding irregularities Finance _Signature and bank note verification _Risk Management _Foreign exchange rate forecasting _Bankruptcy prediction _Customer credit scoring _Credit card approval and fraud detection _Forecasting economic turning points _Bond rating and trading _Loan approvals _Economic and financial forecasting Marketing _Classification of consumer spending pattern _New product analysis _Identification of customer characteristics _Sale forecasts Human resources _Predicting employee s performance and behavior _Determining personnel resource requirements. VIII. DESIGN PROBLEMS: There are no general methods to determine the optimal number of neurones necessary for solving any problem. It is difficult to select a training data set which fully describes the problem to be solved. SOLUTIONS TO IMPROVE NN PERFORMANCE: Designing Neural Networks using Genetic Algorithms Neuro-Fuzzy Systems 375

IX. CONCLUSION At present, neural network is very suitable for solving the problems of data mining because its characteristics of good robustness, selforganizing adaptive, parallel processing, distributed storage and high degree of fault tolerance. Compared to statistical methods, NN are useful especially when there is no a priori knowledge about the analyzed data. They offer a powerful and distributed computing architecture, with significant learning abilities and they are able to represent highly nonlinear and multivariable relationships. The combination of data mining method and neural network model can greatly improve the efficiency of data mining methods, and it has been widely used. It also will receive more and more attention. X. REFERENCES [1]Agrawal, R., Imielinski, T., Swami, A., base Mining: A Performance Perspective, IEEE Transactions on Knowledge and Engineering, pp. 914-925, December 1993. [2] Berry, J. A., Lindoff, G., Mining Techniques, Wiley Computer Publishing, 1997 (ISBN 0-471-17980-9). [3]Berson, Warehousing, -Mining & OLAP, TMH. [4]Haykin, S., Neural Networks, Prentice Hall International Inc., 1999. [5]Khajanchi, Amit, Artificial Neural Networks: The next intelligence. [6]Zurada J.M., An introduction to artificial neural networks systems, St. Paul: West Publishing (1992). [7] Y. Bengio, J. M. Buhmann, M. Embrechts, and J.M. Zurada. Introduction to the special issue on neural networks for data mining and knowledge discovery. IEEE Trans. Neural Networks. [8] M. W. Craven and J. W. Shavlik.Using neural networks for data mining.future Generation Computer Systems, 13:211 229, 1997. 376