ADVANCES in NATURAL and APPLIED SCIENCES
ISSN: 1995-0772, EISSN: 1998-1090
Published by AENSI Publication, http://www.aensiweb.com/anas
2017 February 11(2): pages 14-18
Open Access Journal
A Novel Framework for Anomaly Detection and Prediction of significant signs of changing climate events using Machine learning techniques
1 Dr. P. Vaishnavi, 2 G. Palanivel, 2 Dr. K. Duraiswamy
1 Asst. Professor, Dept. of Computer Applications, Anna University, BIT Campus, Trichy. 2 Research Scholar, Anna University, Chennai. 2 Dean, K. S. Rengasamy college of Engineering and technology, Tiruchengode.
Received 18 December 2016; Accepted 12 February 2017; Available online 20 February 2017
Address For Correspondence: Dr. P. Vaishnavi, Asst. Professor, Dept. of Computer Applications, Anna University, BIT Campus, Trichy.
Copyright 2017 by authors and American-Eurasian Network for Scientific Information (AENSI Publication). This work is licensed under the Creative Commons Attribution International License (CC BY). http://creativecommons.org/licenses/by/4.0/
ABSTRACT
Changes in extreme weather and climate events have had a significant impact on society and pose challenges to it. As a result of extensive analysis of weather data sets, expectations of information services on changing climate patterns are growing rapidly. Meeting these expectations requires intensive knowledge extraction on the behavior of extreme changes in climate anomalies. Data mining is the most widely used technique for analyzing large volumes of data to uncover hidden insights about behavior, constraints and patterns. In data mining, anomaly detection identifies critical problems in every area of networking, especially in real-time data analysis. This research work proposes a novel framework for weather data analysis, aimed in particular at supporting appropriate policy decisions on climate change for a state.
The anomaly detection process is performed as follows. The weather data sets are first cleaned and then used to identify anomalies. Anomaly detection finds the outliers in the real-valued data set in order to detect significant changes in climate events. The key issue in finding outliers is the presence of false positive values in the real-time data sets, which lead to unexpected patterns of results or to negative results. We propose a novel framework that uses machine learning techniques to predict critical information about unexpected outcomes of changing climate events for a region in the state.
KEYWORDS: Weather data sets, Data mining techniques, Anomaly Detection, Machine learning technique, Predictive analytics.
INTRODUCTION
Anomaly detection refers to finding unexpected hidden patterns that do not match the normal flow of behavior. These unexpected patterns are referred to as anomalies, outliers, discordant observations, peculiarities, aberrations, etc. [4]. An anomaly detection system monitors the behavior of a system and flags significant deviations from usual activity as anomalies [2]. The approach is based on statistical techniques for analyzing real-time data sets to detect anomalies and deviations. Such systems are used to find attacks in network data and to detect intruder activity in a system. Detection systems usually implement signature-based detection, which is unable to identify new types of attack [2]. In this paper, we propose anomaly detection combined with machine learning techniques, including classification, clustering and neural-network-based algorithms, to identify deviations in real-time data sets. The main challenge of this work is to track deviations in weather data sets according to their seasonal implications. The adopted algorithm is used to train a model on real data in order to obtain the patterns in climate events.
Approaches to anomaly detection are as follows. The supervised approach is best for known attacks. The unsupervised approach is effective for both known and unknown attacks. Clustering is also used in the unsupervised approach [4]. Machine learning is a specialized area of computer science that adapts mining techniques to automate the recognition of patterns in specific data sets [10]. In this paper we work on detecting anomalies in real-time rainfall data sets and on predicting the significant signs of changing climate events for a specific region in the state. The proposed framework is designed to forecast changing climate problems from a decade of weather data. In future, this work will be extended to study the impact of changing climate events in the state.
This paper is organized as follows. Section 2 reviews the literature on detection systems. Section 3 describes intrusion detection and its types. Section 4 describes anomaly detection techniques. Section 5 describes the proposed framework. Section 6 presents the results and discussion. Section 7 presents the conclusion and future work.
To Cite This Article: Dr. P. Vaishnavi, G. Palanivel, Dr. K. Duraiswamy, A Novel Framework for Anomaly Detection and Prediction of significant signs of changing climate events using Machine learning techniques. Advances in Natural and Applied Sciences. 11(2); Pages: 14-18
Literature Review:
Anomaly detection has become an important area of intensive research for secure communication. Many authors have suggested various approaches for unsupervised anomaly intrusion detection with artificial neural networks. Seungmin Lee et al. [3] have reported that, in a framework combining a neural network with K-means clustering for the detection of real-time anomalies, new attacks can also be detected in an intelligent way. The algorithm is reported to be dynamically adaptive, with an increased detection rate while keeping the false alarm rate to a minimum. Adebayo O. et al. [7] have used two machine learning techniques, namely the Rough Set (LEM2) algorithm and the k-nearest neighbor (kNN) algorithm, for intrusion detection. However, the poor detection rate of these algorithms on U2R and R2L attacks has been attributed to the few representations of these attacks in the training data set.
In addition, the attribute values in the training data set are completely different from those in the test data set for these two attack types. Ozgur Depren et al. [9] have designed a model for both misuse and anomaly intrusion detection, employing a self-organizing map (SOM) to detect anomalies using only a limited number of important features. The model is based only on normal behavioral patterns, and any deviation from normal is considered an attack. Zhi-song Pan et al. [8] have reported a misuse intrusion detection model based on a hybrid of a neural network and a decision tree algorithm. They have discussed the advantages of the different classification abilities of neural networks and the C4.5 algorithm for different attacks. While the neural network is reported to perform well on DoS and Probe attacks, the C4.5 algorithm has been found to detect R2L and U2R attacks more accurately.
Intrusion Detection System:
Immunity-based agents roam among the machines (nodes or routers) and check the situation in the network (i.e. look for changes such as malfunctions, faults, abnormalities, misuse, deviations, intrusions, etc.). These agents can mutually recognize each other's activities and can take appropriate actions according to the underlying security policies. Specifically, their activities are coordinated in a hierarchical fashion while sensing, communicating and generating responses. Such an agent can learn and adapt to its environment dynamically and can detect both known and unknown intrusions. This research is part of an effort to develop a multi-agent detection system that can simultaneously monitor the activities of networked computers at different levels (such as user level, system level, process level and packet level) in order to identify intrusions and anomalies in the data sets. In addition to anomalies, false positives are a major problem, since most of the alerts are not real [13].
This intrusion detection system gives the user better monitoring of the network environment and provides an additional tool for making computing systems secure.
Types of Intrusion Detection System:
1. Network Intrusion System: An intrusion detection system is used to enhance the security of networks by inspecting all inbound and outbound network activity and by classifying suspicious patterns as possible intrusions [2]. Researchers now focus on applying outlier detection techniques to anomaly detection because of their promising results in classifying true attacks and in reducing the false alarm rate [10].
2. Host Based Intrusion System: This system audits information, including identification and authentication mechanisms [2]. It is used to find the path of an intrusion. It monitors the behavior of network packets and the traffic on the host [10].
3. Hybrid Intrusion System: This system provides an option to manage both the Network Intrusion System and the Host Based Intrusion System through a central intrusion detection system [10].
Categories of Intrusion Detection System:
Misuse Detection based IDS: This is the most commonly used technique today [10]. Its principle is to find known attack patterns: the patterns are collected and then applied to identify various attacks. The attacks it targets exploit system weaknesses and attack software applications [2].
1. Signature based approach: This method is widely used in intrusion detection systems. It depends on the creation of signatures [3]. A signature is created on the basis of the code of the associated traffic pattern. The method detects malicious traffic but is unsuited to new types of attack [10].
2. Anomaly based Intrusion Detection: This is used to detect abnormal behavior. The technique is based on a profile of normal behavior and depends entirely on distinguishing between abnormal and normal activities [10].
Anomaly Detection System:
The purpose of this system is to find patterns in data that do not conform to expected behavior [5]. These patterns are called anomalies, outliers, exceptions, surprises, etc. in different domains.
Types of Anomaly:
1. Point Anomalies: A point anomaly is an individual data instance that is anomalous with respect to the rest of the data for a specific area.
2. Contextual Anomalies: These are also called conditional anomalies, because a data instance is anomalous only in a specific context. They depend on two types of attributes of each data instance: contextual attributes and behavioral attributes.
3. Collective Anomalies: A collection of related data instances is anomalous with respect to the entire data set. The individual instances may not be anomalous by themselves, but their occurrence together as a collection is. Collective anomalies occur only in data sets whose instances are related, whereas the occurrence of contextual anomalies depends on the context attributes in the data [4].
Techniques of anomaly detection:
1. Supervised anomaly detection: The performance of supervised algorithms depends heavily on attack-free training data. However, this kind of training data is difficult to obtain in a real-world network environment. Moreover, as the network environment or services change, the patterns of normal traffic change as well. This leads to a high false positive rate in supervised anomaly-based network intrusion detection systems (ANIDSs). Anomaly detection can detect novel attacks, which increases the detection rate. Compared to supervised approaches, unsupervised approaches remove the dependency on attack-free training data sets.
2. Unsupervised anomaly detection: Unsupervised approaches have a higher false positive rate than supervised approaches. With unsupervised anomaly detection techniques, however, the system can be trained on unlabeled data and is capable of detecting previously unseen attacks. Unsupervised approaches achieve a higher detection rate than supervised approaches.
3. Semi-Supervised anomaly detection: Semi-supervised techniques depend on training data instances for the normal class only. Because they do not require labels for the anomaly class, they are more widely applicable than supervised techniques. The approach is to build a model of normal behavior and use it to identify the anomalies in data sets.
Anomaly detection using Machine Learning Technique:
Machine learning, a branch of artificial intelligence, is a scientific discipline that deals with the design and development of algorithms that allow computers to evolve behaviors based on heterogeneous data from databases [13]. The main aim of research in the field is to automate the recognition of complex patterns and to make sound decisions based on the data. It is related to statistics, probability theory, control and theoretical computer science [4]. The aim of such algorithms is to generalize from experience.
The field also provides details regarding performance bounds, and computational learning theorists additionally study the time complexity and feasibility of learning [13]. Clustering is a technique that assigns data elements to relevant groups without prior knowledge of the group definitions [4].
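The difference between the point and contextual anomaly types described earlier can be illustrated with a minimal Python sketch on rainfall data. It is not the method of this paper: a plain z-score test is swapped in as the simplest statistical detector, the calendar month is assumed to be the contextual attribute and the rainfall amount the behavioral attribute, and the function names, thresholds and rainfall values are all hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev


def point_anomalies(values, threshold=3.0):
    """Return indices whose z-score against the whole series exceeds threshold."""
    mu, sigma = mean(values), stdev(values)
    if sigma == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


def contextual_anomalies(records, threshold=2.0):
    """Return indices that are unusual for their own calendar month.

    records is a list of (month, rainfall_mm) pairs. The month is the
    contextual attribute; the rainfall amount is the behavioral attribute.
    """
    by_month = defaultdict(list)
    for month, rain in records:
        by_month[month].append(rain)

    flagged = []
    for i, (month, rain) in enumerate(records):
        group = by_month[month]
        if len(group) < 3:  # too few observations to estimate a spread
            continue
        mu, sigma = mean(group), stdev(group)
        if sigma > 0 and abs(rain - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Hypothetical data: ten years of rainfall (mm) for a dry and a wet month.
january_mm = [8, 12, 10, 9, 11, 10, 12, 9, 120, 10]
july_mm = [190, 210, 200, 205, 195, 200, 215, 185, 198, 202]

# Interleave the two months year by year: (1, jan), (7, jul), (1, jan), ...
records = []
for jan, jul in zip(january_mm, july_mm):
    records.append((1, jan))
    records.append((7, jul))

all_values = [rain for _, rain in records]
print("point anomalies:", point_anomalies(all_values))
print("contextual anomalies:", contextual_anomalies(records))
```

In this made-up series, 120 mm of rain lies close to the overall average, so the global test flags nothing, while the per-month test flags it as unusual for January. Because the mean and standard deviation are themselves distorted by outliers, a more robust statistic would be a reasonable substitute on real data.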
Proposed Framework:
This paper focuses on detecting anomalous outliers in weather data sets for rainfall in a particular region. The primary goal is to find critical information about changing climate events for that region in the state.
Fig. 1: Proposed framework
As shown in Fig. 1, the designed framework is able to identify the false positive values in the weather data sets and to find significant changes in climate events due to seasonal variation over many years. The results convey the extreme variability of rainfall events within the state. They also help to predict false alarms on climate events, which lead to disagreement in policy decisions within the state, and to take preventive measures for the agricultural community. In this framework, rainfall data sets are collected from a database and passed to a cleansing process. Anomaly detection is applied to the cleansed data. Classification is then performed to separate positive values from false positive values. If false positive values are detected, the data move on to the filtering process and finally to the framework for predictive analytics. If none are found, the data are simply transferred to the framework. In the final stage, predictive analytics is applied to project the future behavior of the data, and a report is generated.
Conclusion:
The detection of outliers and false positive values provides essential information about a specific problem, particularly in natural resources, for identifying changes in climate events. It offers a way to protect the environment from destruction and a chance to take necessary preventive measures in time.
False positive values in rainfall data sets for a specific region will result in false alerts when taking policy decisions and may create unforeseen circumstances in future. Outlier detection also provides a chance to identify the future availability of resources and to study the impact of anomalies on changing climate events.
REFERENCES
1. Aneetha, A.S. and Dr. S. Bose, 2012. The combined approach for anomaly detection using neural networks and clustering techniques. International Journal of Computer Science and Engineering, 2(4).
2. Hanumantha Rao, K., G. Srinivas, Ankam Damodhar and M. Vikas Krishna, 2011. Implementation of Anomaly detection technique using machine learning algorithm. International Journal of Computer Science and Telecommunications, 2: 3.
3. Seungmin Lee, Gisung Kim, Sehum Kim, 2011. Self adaptive and dynamic clustering for online anomaly detection. Elsevier, Expert Systems with Applications, 38(12): 14891-14898.
4. Prasanta Gogoi, Borah and Bhattacharyya, 2010. Anomaly detection analysis of intrusion data using supervised and unsupervised approach. Journal of Convergence Information Technology, 5: 1.
5. Varun Chandola, Arindam Banerjee, Vipin Kumar, 2009. Anomaly Detection: A Survey. ACM Computing Surveys, University of Minnesota.
6. Varun Chandola, Arindam Banerjee, Vipin Kumar, 2009. Anomaly Detection: A Survey. Technical report, University of Minnesota. A modified version of this technical report will appear in ACM Computing Surveys.
7. Adebayo, O., Adetunmbi, Samuel O. Falaki, Olumide S. Adewale and K. Boniface, 2008. Network Intrusion Detection based on Rough Set and k-nearest Neighbour. International Journal of Computing and ICT Research, 2(1): 60-66.
8. Pan ZhiSong, Lian Hong, Hu GuYu, Ni GuiQiang, 2005. An Integrated Model of Intrusion Detection Based on Neural Network and Expert System. Proceedings of the 17th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 05), pp: 2-3.
9. Ozgur Depren, Murat Toppallar, Emin Anarim, M. Kemal Ciliz, 2005. An Intelligent Intrusion Detection System (IDS) for anomaly and Misuse Detection in Computer Networks. Elsevier, Expert Systems with Applications, 29(4): 713-722.
10. Mitchell, T., 1997. Machine Learning. McGraw Hill, ISBN 0-07-042807-7.
11. Heady, R., G. Luger, A. Maccabe and M. Servila, 1990. The architecture of a network level intrusion detection system. Computer Science Department, University of New Mexico, Tech. Rep.
12. Divya and Surender Lakra. HSNORT: A Hybrid Intrusion Detection System using Artificial intelligence with Snort. International Journal of Computer Technology and Application, 4(3): 466-470.
13. Damiano Bolzoni, Sandro Etalle. APHRODITE: an Anomaly-based Architecture for False Positive Reduction. University of Twente.