Neural Networks In Mining Abstract-The application of neural networks in the data mining has become wider. Although neural networks may have complex structure, long training time, and uneasily understandable representation of results, neural networks have high acceptance ability for noisy data and high accuracy and are preferable in data mining. In this paper the data mining based on neural networks is disscussed in detail, and the key technology and ways to achieve the data mining based on neural networks are also disscussed. KEYWORDSNeural Network (NN), neural network topology, back propagation algorithm, Advantages, mining, data mining process, implementation. I. INTRODUCTION mining is the term used to describe the process of extracting value from a database. A data-warehouse is a location where information is stored. The type of data stored depends largely on the type of industry and the company. mining tools can forecast the future trends and activities to support the decision of people. For example, through analyzing the whole database system of the company the data mining tools can answer the problems such as Which custom-- -er is most likely to respond to the e-mail marketing activities of our company, why, and other similar problems. Some data mining tools can also resolve some traditional problems which consumed much time, this is because that they can rapidly browse the entire database and find some useful information experts unnoticed. Neural networksrepresent a brain metaphor for information processing. These models are biologically inspired rather than an exact Mohan Vishal Gupta 1, Namit Gupta 2, Sachin Singh 3 Department of Computer Science, CCSIT Teerthanker Mahaveer University, Moradabad 1 mohan.vishal01@gmail.com 2 namit.k.gupta@gmail.com 3 ingh.sachin1986@gmail.com replica of how the brain actually functions. Neural networks have been shown to be very promising systems in many forecasting applications and business classification applications due to their ability to learn from the data, their nonparametric nature (i.e., no rigid assumptions), and their ability to generalize. II. NEURALNETWORK A Neural Network (NN) is a mathematical model or computational model based on biological neural networks, in other words, is an emulation of biological neural system. It consists of an interconnected group of artificial neurons and processes information using a connectionist approach to computation. In most cases an NN is an adaptive system that changes its structure based on external or internal information that flows through the network during the learning phase. But its advantages such as high affordability to the noise data and low error rate, the continuously advancing and optimization of various network training algorithms, especially the continuously advancing and improvement of various network pruning algorithms and rules extracting algorithm, make the application of the neural network in the data mining increasingly favored by the overwhelming majority of users. In this paper the data mining based on the neural network is discussed in detail.neural computingrefers to a pattern recognition methodology for machine learning. The resulting model from neural 370
computing is often called an artificial neural network (ANN) or a neural network. Neural networks have been used in many business applications for pattern recognition, forecasting, prediction, and classification. Neural network computing is a key component of any data mining tool kit. A number of advances in technology and business processes have contributed to a growing interest in data mining in both the public and private sectors. Some of these changes include the growth of computer networks, which can be used to connect databases; the development of enhanced search-related techniques such as neural networks and advanced algorithms; the spread of the client/server computing model, allowing users to access centralized data resources from the desktop; and an increased ability to combine data from disparate sources into a single search source. Fig 1:Ariticial Neural Network III. NEURALNETWORKTOPOLOGIES: Feed forward neural network: The feed forward neural network was the first and arguably simplest type of artificial neural network devised. In this network, the information moves in only one direction, forward, from the input nodes, through the hidden nodes (if any) and to the output nodes. There are no cycles or loops in the network. The data processing can extend over multiple (layers of) units, but no feedback connections are present, that is, connections extending from outputs of units to inputs of units in the same layer or previous layers. Recurrent network: Recurrent neural networks that do contain feedback connections. Contrary to feed forward networks, recurrent neural networks (RNs) are models with bi-directional data flow. While a feed forward network propagates data linearly from input to output, RNs also propagate data from later processing stages to earlier stages. Training of Artificial Neural Networks: A neural network has to be configured such that the application of a set of inputs produces (either 'direct' or via a relaxation process) the desired set of outputs. Various methods to set the strengths of the connections exist. One way is to set the weights explicitly, using a priori knowledge. Another way is to 'train' the neural network by feeding it teaching patterns and letting it change its weights according to some learning rule. We can categorize the learning situations as follows: Supervised learning or Associative learning in which the network is trained by providing it with input and matching output patterns. These input-output pairs can be provided by an external teacher, or by the system which contains the neural network (self-supervised). Unsupervised learning or Self-organization in which an (output) unit is trained to respond to clusters of pattern within the input. In this paradigm the system is supposed to discover statistically salient features of the input population. Unlike the supervised learning paradigm, there is no a priori set of categories into which the patterns are to be classified; rather the system must develop its own representation of the input stimuli. Reinforcement Learning This type of learning may be considered as an intermediate form of the above two types of learning. Here the learning machine does some action on the environment and gets a feedback response 371
from the environment. The learning system grades its action good (rewarding) or bad (punishable) based on the environmental response and accordingly adjusts its parameters. IV. NEURAL NETWORKS IN DATA MINING There are seven common methods and techniques of data mining which are the methods of statistical analysis, rough set, covering positive and rejecting inverse cases, formula found, fuzzy method, as well as visualization technology. Here, we focus on neural network method. Neural network method is used for classification, clustering, feature mining, prediction and pattern recognition. It imitates the neurons structure of animals, bases on the M-P model and Hebb learning rule, so in essence it is a distributed matrix structure. Through training data mining, the neural network method gradually calculates (including repeated iteration or cumulative calculation) the weights the neural network connected. Fig 2: Mining Process mining is the business of answering questions that you ve not asked yet. mining reaches deep into databases. mining tasks can be classified into two categories: Descriptive and predictive data mining. Descriptive data mining provides information to understand what is happening inside the data without a predetermined idea. Predictive data mining allows the user to submit records with unknown field values, and the system will guess the unknown values based on previous patterns discovered form the database. mining models can be categorized according to the tasks they perform: Classification and Prediction, Clustering, Association Rules. Classification and prediction is a predictive model, but clustering and association rules are descriptive models. i.classification:the most common action in data mining is classification. It recognizes patterns that describe the group to which an item belongs. It does this by examining existing items that already have been classified and inferring a set of rules. ii. Clustering: Similar to classification is clustering. The major difference being that no groups have been predefined. iii. Prediction: Prediction is the construction and use of a model to assess the class of an unlabeled object or to assess the value or value ranges of a given object is likely to have. iv. Forecasting: The next application is forecasting. This is different from predictions because it estimates the future value of continuous variables based on patterns within the data. Neural networks, depending on the architecture, provide massociations, classifications, clusters, prediction and forecasting to the data mining industry. V. DATAMINING PROCESS BASED ON NEURAL NETWORK mining process can be composed by three main phases: A. preparation, B. mining, C. Expression and interpretation of the results, mining process is the reiteration of the three phases. 372
Sources Integration Selection Target Sources Preprocess Result Knowledge Expression Information Mining Preprocessed Fig.3. General Mining Process Original Preparing Rules Extracting Useful Rules Rules Assessment 373
Fig.4. Mining Process Based On Neural Network preparation: preparation is to define and process the mining data to make it fit specific data mining method. preparation is the first important step in the data mining and plays a decisive role in the entire data mining process. It mainly includes the following four processes: a) cleaning: cleansing is to fill the vacancy value of the data, eliminate the noise data and correct the inconsistencies data in the data. b) option: option is to select the data arrange and row used in this mining. c) preprocessing: preprocessing is to enhanced process the clean data which has been selected. d) expression: expression is to transform the data after preprocessing into the form which can be accepted by the data mining algorithm based on neural network. Rules Extracting There are many methods to extract rules, in which the most commonly used methods are LRE method, black-box method, the method of extracting fuzzy rules, the method of extracting rules from recursive network, the algorithm of binary input and output rules extracting (BIO-RE), partial rules extracting algorithm (Partial-RE) and full rules extracting algorithm (Full-RE). Rules Assessment Although the objective of rules assessment depends on each specific application, but, in general terms, the rules can be assessed in accordance with the following objectives. 1) Find the optimal sequence of extracting rules, making it obtains the best results in the given data set; 2) Test the accuracy of the rules extracted; 3) Detect how much knowledge in the neural network has not been extracted; 4) Detect the inconsistency between the extracted rules and the trained neural network. VI. KEY TECHNIQUES AND APPROACHES OF IMPLEMENTATION Effective Combination of Neural Network and Mining Technology The technology almost uses the original NN software package or transformed from existing NN development tools, the workflow of data mining should be understood in depth, the data model and application interfaces should be described with standardized form, then the two technologies can be effectively integrated and together complete data mining tasks. Therefore, the approach of organically combining the NN and data mining technologies should be found to improve and optimize the data mining technology. Effective Combination of Knowledge Processing and Neural Computation Evaluating whether a data mining implementation algorithm is fine the following indicators and characteristics can be used: (1) Whether high-quality modeling under the circumstances of noise and data half-baked; (2) The model must be understood by users and can be used for decision-making; (3) The model can receive area knowledge (rules enter and extraction) to improve the modeling quality. Existing neural network has 374
high precision in the quality of modeling but low in the latter two indicators. Neural network actually can be seen as a black box for users, the application restrictions makes the classification and prediction process can not be understood by users and directly used for decision-making. For data mining, it not enough to depend on the neural network model providing results because that before important decision-making users need to understand the rationale and justification for the decision-making. Therefore, in the NN data mining knowledge base should be established in order to accede domain knowledge and the knowledge NN learning to the system in the data mining process. That is to say, in the NN data mining, it is necessary to use knowledge method to extract knowledge from the data mining process and realize the inosculation of the knowledge processing and neural network. In addition, in the system an effective decision and explanation mechanism should also be considered to be established to improve the validity and practicability of the NN data mining technology. VII. ADVANTAGES OF NEURAL NETWORKS: 1. High Accuracy: Neural networks are able to approximate complex non-linear mappings 2. Noise Tolerance: Neural networks are very flexible with respect to incomplete, missing and noisy data. 3. Independence from prior assumptions: Neural networks do not make a priori assumptions about the distribution of the data,or the form of interactions between factors. 4. Ease of maintenance: Neural networks can be updated with fresh data, making them useful for dynamic environments. 5. Neural networks can be implemented in parallel hardware 6. When an element of the neural network fails, it can continue without any problem by their parallel nature. The following are examples of where neural networks have been used. Accounting _Identifying tax fraud _Enhancing auditing by finding irregularities Finance _Signature and bank note verification _Risk Management _Foreign exchange rate forecasting _Bankruptcy prediction _Customer credit scoring _Credit card approval and fraud detection _Forecasting economic turning points _Bond rating and trading _Loan approvals _Economic and financial forecasting Marketing _Classification of consumer spending pattern _New product analysis _Identification of customer characteristics _Sale forecasts Human resources _Predicting employee s performance and behavior _Determining personnel resource requirements. VIII. DESIGN PROBLEMS: There are no general methods to determine the optimal number of neurones necessary for solving any problem. It is difficult to select a training data set which fully describes the problem to be solved. SOLUTIONS TO IMPROVE NN PERFORMANCE: Designing Neural Networks using Genetic Algorithms Neuro-Fuzzy Systems 375
IX. CONCLUSION At present, neural network is very suitable for solving the problems of data mining because its characteristics of good robustness, selforganizing adaptive, parallel processing, distributed storage and high degree of fault tolerance. Compared to statistical methods, NN are useful especially when there is no a priori knowledge about the analyzed data. They offer a powerful and distributed computing architecture, with significant learning abilities and they are able to represent highly nonlinear and multivariable relationships. The combination of data mining method and neural network model can greatly improve the efficiency of data mining methods, and it has been widely used. It also will receive more and more attention. X. REFERENCES [1]Agrawal, R., Imielinski, T., Swami, A., base Mining: A Performance Perspective, IEEE Transactions on Knowledge and Engineering, pp. 914-925, December 1993. [2] Berry, J. A., Lindoff, G., Mining Techniques, Wiley Computer Publishing, 1997 (ISBN 0-471-17980-9). [3]Berson, Warehousing, -Mining & OLAP, TMH. [4]Haykin, S., Neural Networks, Prentice Hall International Inc., 1999. [5]Khajanchi, Amit, Artificial Neural Networks: The next intelligence. [6]Zurada J.M., An introduction to artificial neural networks systems, St. Paul: West Publishing (1992). [7] Y. Bengio, J. M. Buhmann, M. Embrechts, and J.M. Zurada. Introduction to the special issue on neural networks for data mining and knowledge discovery. IEEE Trans. Neural Networks. [8] M. W. Craven and J. W. Shavlik.Using neural networks for data mining.future Generation Computer Systems, 13:211 229, 1997. 376