INTRUSION DETECTION MODEL IN DATA MINING BASED ON ENSEMBLE APPROACH

VIKAS SANNADY 1, POONAM GUPTA 2
1 Asst. Professor, Department of Computer Science, GTBCPTE, Bilaspur, Chhattisgarh, India
2 Asst. Professor, Department of Computer Science, GTBCPTE, Bilaspur, Chhattisgarh, India

Abstract - The need for secure networks has increased tremendously, so the security of computers and of the networks that connect them is becoming increasingly significant. Computer attacks have become very common. Although many mechanisms for intrusion detection exist, the major issues remain the security and accuracy of the system. A data mining based intrusion detection system requires thorough knowledge of the intrusion detection domain in order to extract relevant rules efficiently from huge amounts of records. In this paper we investigate and evaluate the ensemble bagging data mining technique as an intrusion detection mechanism. Our research shows that bagging decision trees gives better overall performance than a single model.

Key Words: Bagging ensemble approach, intrusion detection system.

1. INTRODUCTION
An intrusion detection system (IDS) monitors network traffic for suspicious activity and alerts the system or network administrator. In some cases the IDS may also respond to anomalous or malicious traffic by taking action such as blocking the user or source IP address from accessing the network. There are various types of intrusion detection techniques:

NIDS: Network Intrusion Detection Systems are placed at a strategic point or points within the network to monitor traffic to and from all devices on the network.
HIDS: Host Intrusion Detection Systems run on individual hosts or devices on the network.
Signature Based: A signature based IDS monitors packets on the network and compares them against a database of signatures or attributes from known malicious threats. This is similar to the way most antivirus software detects malware.
Anomaly Based: An anomaly based IDS monitors network traffic and compares it against an established baseline. The baseline identifies what is normal for that network [1].

This work aims to design and develop a security architecture for computer networks. We build the model to improve the classification rate for known attacks with a minimum false alarm rate. We train and test our proposed model on the normal traffic and the known attacks in the NSL-KDD dataset. The proposed system should have an adaptive capability.

LITERATURE SURVEY
This section surveys classification techniques applied to IDS data. Many research papers have been published on classifiers for detecting intrusions in network datasets. Some of the related works are described below.

Manikandan R et al. [2] proposed a new ensemble boosted decision tree approach for intrusion detection systems.

Amit Kumar et al. [3] note that the purpose of an intrusion detection system is to detect attacks, but that it is equally important to detect attacks at an early stage in order to minimize their impact. They use a dataset and a classifier to detect intruders in networks. There are a variety of approaches to intrusion detection, such as pattern matching, machine learning, data mining, and measure based methods. Their paper aims at a proper survey of IDS so that researchers can make use of it and find new techniques against intrusions.

Zibusiso Dewa et al. [4] give an overview of existing intrusion detection systems (IDS) along with their main principles.
The article also discusses whether data mining and its core feature, knowledge discovery, can help in creating data mining based IDSs that achieve higher accuracy on novel types of intrusion and demonstrate more robust behaviour than traditional IDSs.

Poonam Gupta et al. [5] investigate and evaluate decision tree data mining techniques as an intrusion detection mechanism. Their research shows that decision trees give better overall performance.

2016, IRJET ISO 9001:2008 Certified Journal Page 1654

Vivek Nandan Tiwari et al. [6] note that researchers in the IDS community have been led not only to develop better detection algorithms and signature tuning methods, but also to focus on determining a variety of relations between individual alerts, formally known as alert correlation.

Priya U. Kadam et al. [7] propose an approach to generate rules for different types of anomalous connections and assess its effectiveness and accuracy. The KDDCUP99 training and testing datasets are used to generate effective new rules with a reasonable detection rate.

Sandhya Peddabachigari et al. [8] investigate and evaluate decision tree data mining techniques as an intrusion detection mechanism and compare them with Support Vector Machines (SVM). Intrusion detection with decision trees and SVM was tested on the benchmark 1998 DARPA Intrusion Detection dataset. Their research shows that decision trees give better overall performance than SVM.

Heba Ezzat Ibrahim et al. [9] report experimental results showing that their proposed multi-layer model, using the C5 decision tree with feature selection by Gain Ratio, achieves a higher classification rate and a lower false alarm rate than MLP and naïve Bayes. Using Gain Ratio significantly enhances the accuracy on U2R and R2L for the three machine learning techniques (C5, MLP and naïve Bayes). MLP has a high classification rate when using all 41 features in the DoS and Probe layers.

K. Nageswara Rao et al. [10] evaluate the influence of attribute pre-selection using statistical techniques on the real-world KDDCUP99 data set. Their experimental results show that the accuracy of the C4.5 classifier can be improved with the robust pre-selection approach compared to traditional feature selection techniques.

PROPOSED ARCHITECTURE
The proposed research work introduces a new framework for offline analysis for network intrusion detection, as shown in Fig 1.
In this framework the KDD99cup [11] dataset is given to the preprocessing stage, which includes the bagging technique.

Fig 1: Proposed Architecture for Bagging Technique.

OUR APPROACH
The proposed model uses a bagging decision tree, i.e. Hoeffding tree, classification technique to increase the performance of the intrusion detection system. An ensemble model is a combination of two or more models that avoids the drawbacks of the individual models and achieves high accuracy. The models are combined using a highest-confidence-wins scheme, in which votes are weighted by the confidence value of each prediction. The weights are then summed and the value with the highest total is selected. The confidence for the final selection is the sum of the weights for the winning value divided by the number of models included in the ensemble model. In this work many ensemble models built from combinations of the individual models were tried, and the ensemble model that produced the highest accuracy among all the ensemble models as well as the individual models was finally selected [12].

BAGGING
For each trial t = 1, 2, ..., T, a training set of size N is sampled (with replacement) from the original instances. This training set is the same size as the original data, but some instances may not appear in it while others appear more than once. The learning system generates a classifier C^t from the sample, and the final classifier C* is formed by aggregating the T classifiers from these trials. To classify an instance x, a vote for class k is recorded by every classifier for which C^t(x) = k, and C*(x) is then the class with the most votes.

Using CART as the learning system, Breiman (1996) reports results of bagging on seven moderate-sized datasets. With the number of replicates T set at 50, the average error of the bagged classifier C* ranges from 0.57 to 0.94 of the corresponding error when a single classifier is learned. Breiman introduces the concept of an order-correct classifier-learning system as one that, over many training sets, tends to predict the correct class of a test instance more frequently than any other class. An order-correct learner may not produce optimal classifiers, but Breiman shows that aggregating classifiers produced by an order-correct learner results in an optimal classifier. Breiman notes: "The vital element is the instability of the prediction method. If perturbing the learning set can cause significant changes in the predictor constructed, then bagging can improve accuracy" [13].

EXPERIMENTAL RESULTS
Evaluation on KDDCup 99 Data Set
The experiment is carried out on a real intrusion detection data stream that was used in the Knowledge Discovery and Data Mining (KDD) Cup 1999 competition. In the KDD99 dataset the input data flow contains the details of the network connections, such as protocol type, connection duration, login type, etc. Each data sample in the KDD99 dataset represents the attribute values of a connection in the network data flow, and each sample is labeled either as normal or as an attack with exactly one specific attack type. In total, 41 features are used in the KDD99 dataset, and each connection can be categorized into five main classes: one normal class and four main intrusion classes, DOS, U2R, R2L and Probe. There are 22 different types of attacks, grouped into these four main classes as tabulated in Table I [2].

Table I. Different Types of Attacks

Main Attack Classes        22 Different Attack Types
DOS (Denial of service)    back, land, neptune, pod, smurf, teardrop
U2R (User to Root)         buffer_overflow, loadmodule, perl, rootkit
R2L (Remote to user)       ftp_write, guess_password, imap, multihop, phf, spy
Probe                      ipsweep, nmap, portsweep

Cross Validation Technique
The experimental setting uses the KDD99 Cup data, taking 10% of the whole raw data stream, selected as per the proposed algorithm.

Table II: Testing the system by cross validation. In this experiment k=10 has the highest accuracy, and its confusion matrix is also given below.

Datasets used   Correctly        Incorrectly      TP     FP     Precision  Recall  F-Measure  ROC
for testing     classified (%)   classified (%)   Rate   Rate
K=2             99.583           0.416
K=4             99.607           0.393
K=6             99.611           0.389
K=8             99.626           0.373
K=10            99.642           0.357
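
As a consistency check on the k=10 row of Table II, the headline rates can be recomputed from the four cell counts of the k=10 confusion matrix reported with it (13414, 35, 55 and 11688). The sketch below is illustrative only: the function name summarize and the dictionary keys are hypothetical, and it assumes that rows of the matrix are actual classes and columns are predicted classes, which is an assumption about the layout rather than something the paper states.

```python
def summarize(matrix, labels):
    """Recompute summary rates from a square confusion matrix.

    Assumption: matrix[i][j] is the number of class-i samples
    that were predicted as class j (rows = actual, columns = predicted).
    """
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(labels)))
    report = {
        "correct_pct": 100 * correct / total,
        "incorrect_pct": 100 * (total - correct) / total,
    }
    for i, name in enumerate(labels):
        actual = sum(matrix[i])                    # samples truly in class i
        predicted = sum(row[i] for row in matrix)  # samples predicted as class i
        tp = matrix[i][i]
        recall = tp / actual
        precision = tp / predicted
        report[name] = {
            "tp_rate": recall,
            "fp_rate": (predicted - tp) / (total - actual),
            "precision": precision,
            "f_measure": 2 * precision * recall / (precision + recall),
        }
    return report


# Cell counts taken from the k=10 confusion matrix in the paper.
k10 = [[13414, 35],
       [55, 11688]]
report = summarize(k10, ["normal", "anomaly"])
print(round(report["correct_pct"], 3), round(report["incorrect_pct"], 3))
for name in ("normal", "anomaly"):
    print(name, {key: round(value, 3) for key, value in report[name].items()})
```

Under these assumptions the recomputed correctly and incorrectly classified percentages come out at about 99.643 and 0.357, which agrees with the k=10 row to within 0.001.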

Confusion Matrix for k=10:

a        b        classified as
13414    35       a = normal
55       11688    b = anomaly

Percentage Split Technique

Table III: Testing the system by splitting the dataset at different percentages

Percentage   Correctly        Incorrectly      TP     FP     Precision  Recall  F-Measure  ROC
Split        classified (%)   classified (%)   Rate   Rate
50%          99.507           0.492            0.995  0.005  0.995      0.995   0.995      0.999
60%          99.593           0.406
70%          99.589           0.410
80%          99.761           0.238            0.998  0.003  0.998      0.998   0.998      0.999

Confusion Matrix for 80%:

a        b        classified as
2675     3        a = normal
9        2351     b = anomaly

CONCLUSION
Intrusion detection classification is a crucial and essential task for security nowadays. Data mining techniques make it possible to design and develop predictive models for intrusion detection classification. This paper explores data mining techniques to design an ensemble model for the classification of intrusions. The testing accuracy of the model shows the efficiency of the ensemble model. The models are also measured in terms of the confusion matrix. The results show that, for intrusion detection, the ensemble model gives higher accuracy than a single model.

REFERENCES
[1] http://netsecurity.about.com/cs/hackertools/a/aa030504_2.htm
[2] Manikandan R, Oviya P, Hemalatha C, "A New Data Mining Based Network Intrusion Detection Model", Journal of Computer Applications, ISSN: 0974-1925, Volume 5, Issue EICA2012-1, February 10, 2012.
[3] Amit Kumar, Harish Chandra Maurya, Rahul Misra, "A Research Paper on Hybrid Intrusion Detection System", International Journal of Engineering and Advanced Technology (IJEAT), ISSN: 2249-8958, Volume 2, Issue 4, April 2013.
[4] Zibusiso Dewa, Leandros A. Maglaras, "Data Mining and Intrusion Detection Systems", International Journal of Advanced Computer Science and Applications, Vol. 7, No. 1, 2016.
[5] Poonam Gupta, S. R. Tandan, Rohit Miri, "Decision Tree Applied For Detecting Intrusion", International Journal of Engineering Research & Technology (IJERT), Vol. 2, Issue 5, May 2013, ISSN: 2278-0181.
[6] Vivek Nandan Tiwari, Prof. Kailash Patidar, Prof. Satyendra Rathore, Prof. Kumar Yadav, "A Comprehensive Survey of Intrusion Detection Systems", Computer Engineering and Intelligent Systems, ISSN 2222-1719 (Paper), ISSN 2222-2863, Vol. 7, No. 1, 2016.
[7] Priya U. Kadam, P. P. Jadhav, "An effective rule generation for Intrusion Detection System using Genetics Algorithm", International Journal of Science, Engineering and Technology Research (IJSETR), Volume 2, Issue 10, October 2013.
[8] Sandhya Peddabachigari, Ajith Abraham, Johnson Thomas, "Intrusion Detection Systems Using Decision Trees and Support Vector Machines".
[9] Heba Ezzat Ibrahim, Sherif M. Badr, Mohamed A. Shaheen, "Adaptive Layered Approach using Machine Learning Techniques with Gain Ratio for Intrusion Detection Systems", International Journal of Detection Systems Using Decision Trees and Support Vector Machines.
[10] K. Nageswara Rao, D. Rajya Lakshmi, T. Venkateswara Rao, "Robust Statistical Outlier based Feature Selection Technique for Network Intrusion Detection", International Journal of Soft Computing and Engineering (IJSCE), ISSN: 2231-2307, Volume 2, Issue 1, March 2012.
[11] "NSL-KDD data set for network-based intrusion detection systems", available at: http://nsl.cs.unb.ca/nsl-KDD.
[12] J. R. Quinlan, "Bagging, Boosting, and C4.5".
[13] H. S. Hota, "Diagnosis of Breast Cancer Using Intelligent Techniques", International Journal of Emerging Science and Engineering (IJESE), ISSN: 2319-6378, Volume 1, Issue 3, January 2013.

BIOGRAPHIES
Vikas Sannady received his Bachelor's degree in Computer Science from Gurughasi Das University in 2010 and his Master's degree from Gurughasi Das Central University in 2013. He is currently working as an Asst. Professor in the Department of Computer Science. His current research interests include data mining in cyber security.

Poonam Gupta received her B.E. degree in Computer Science & Engg. from C.S.V.T.U. University in 2011 and her M.Tech degree from Dr. C.V. Raman University in 2013. She is currently working as an Asst. Professor in the Department of Computer Science. Her current research interests include data mining in cyber security.