Attribute Normalization in Network Intrusion Detection

Wei Wang
Q2S Center in Communication Systems, Norwegian University of Science and Technology (NTNU), Norway

Svein J. Knapskog
Q2S Center in Communication Systems, Norwegian University of Science and Technology (NTNU), Norway

Sylvain Gombault
Département Réseaux, sécurité et multimédia, TELECOM Bretagne, France

Abstract

Anomaly intrusion detection is an important issue in computer network security. As a step of data preprocessing, attribute normalization is essential to detection performance. However, many anomaly detection methods do not normalize attributes before training and detection. A few methods do normalize the attributes, but the question of which normalization method is more effective remains open. In this paper, we introduce four different schemes of attribute normalization to preprocess the data for anomaly intrusion detection. Three methods, k-NN, PCA and SVM, are then employed on the normalized data as well as on the original data to compare the detection results. KDD Cup 1999 data as well as a real data set collected in our department are used to evaluate the normalization schemes and the detection methods. The systematic evaluation results show that attribute normalization greatly improves the detection performance. The statistical normalization scheme is the best choice if the data set is large. The merits and demerits of the detection methods k-NN, PCA and SVM are also analyzed and discussed in this paper to suggest their suitable detection environments.

I. INTRODUCTION

Network security is becoming more and more important as networks have become heavily involved in people's daily life and in all business processes within most organizations. As an important technique in the defense-in-depth network security framework [1], intrusion detection has become a widely studied topic in computer networks in recent years.
In general, the techniques for intrusion detection fall into two major categories: signature-based detection and anomaly detection. Signature-based detection identifies malicious behavior by matching it against pre-defined descriptions of attacks (signatures). Anomaly detection, on the other hand, defines a profile of a subject's normal activities and attempts to identify any unacceptable deviation as a potential attack. Any observable behavior of a system or a network, e.g., network traffic, audit logs or system calls, can be used as the subject information.

Intrusion Detection Systems (IDSs) can also be categorized as host-based IDSs and network-based IDSs according to the target environment for the monitoring. Host-based IDSs usually monitor the host system behavior by examining information of the system, such as CPU time, system calls, keystrokes and command sequences. Examples are [2][3][4][5]. Network-based IDSs, on the other hand, monitor network behavior, usually by examining the content (e.g., payload [6]) as well as some attributes of network traffic [7].

In 1999, Lee et al. [1] constructed 41 attributes from raw traffic data (i.e., tcpdump files) to build classification models for network-based intrusion detection. The raw traffic data was collected at MIT Lincoln Laboratory for the 1998 DARPA Intrusion Detection Evaluation program [8]. The 41 attributes have been shown to be promising for network intrusion detection [1], and the attribute sets of the network traffic have also been used as the KDD Cup 1999 data (the 1999 Knowledge Discovery and Data Mining Tools Competition). The DARPA Intrusion Detection Evaluation [8] as well as the attributes (KDD Cup 1999 data) provide a relatively good benchmark data set, not only for the security research community but also for the data mining research domain.
Although the evaluation process has been criticized [9] for having some flaws, the data set is so far probably the only large, publicly available and well labeled network data source. Many research groups have used the KDD Cup 1999 data to validate their detection methods. Lee et al. [1] used Ripper to mine detection rules from the attribute sets and to build misuse detection models. Jin et al. [10] utilized the covariance matrices of sequential samples to detect multiple network attacks. Katos [11] evaluated cluster, discriminant and logit analysis on the same KDD Cup 1999 data for network intrusion detection. Bouzida and Cuppens [12] used neural networks as well as decision trees for network intrusion detection. Mukkamala et al. [13] evaluated the performance of Artificial Neural Networks (ANNs), SVM and Multivariate Adaptive Regression Splines (MARS) on KDD Cup 1999 data for network intrusion detection. Li et al. [14] used TCM-KNN (Transductive Confidence Machines for K-Nearest Neighbors), and Ma et al. [15] used K-means, fuzzy C-means clustering and fuzzy entropy clustering for intrusion detection. Liao et al. [16] used Fuzzy Adaptive Resonance Theory (ART) and Evolving Fuzzy Neural Networks (EFuNN) for intrusion detection. Shyu et al. [17] employed a Principal Component Classifier for network intrusion detection, also based on the KDD Cup 1999 data. In our previous work [18][19], we used Principal Component Analysis (PCA) for network intrusion identification.

Data preprocessing is very important for anomaly intrusion detection and for many data mining related tasks. Data normalization is an essential step of data preprocessing for most anomaly detection algorithms that learn from the attributes extracted from the audit data. Data normalization scales the values of each continuous attribute into a well-proportioned range so that the effect of one attribute cannot dominate the others. In KDD Cup 1999 data, for example, the values of the attribute dst_bytes (number of data bytes from destination to source) range from 0 to very large values, while the attribute same_srv_rate (fraction of connections to the same service) only ranges from 0 to 1. If the attributes are not normalized into the same (or a similar) scale, one attribute (e.g., dst_bytes) may overwhelm all the others. In effect only that attribute is considered during the detection, and the detection methods are thus not effective.

Except for reference [1], which needs the original attributes to mine detection rules, the other references that use the attributes, [10], [11], [13], [17], [18] and [19], did not normalize the attributes before training and detection. References [14] and [15] used a normalization that converts the data into a standard Normal distribution, while reference [16] converted the data into the range [0,1].

In this paper, we systematically evaluate the impact of different schemes of attribute normalization on the detection performance with three anomaly detection algorithms: PCA (Principal Component Analysis), k-NN (K-Nearest Neighbor) and one-class SVM (Support Vector Machine). We introduce four different schemes of attribute normalization: mean range [0,1] normalization, statistical normalization, ordinal normalization and frequency normalization. KDD Cup 1999 data are used for the evaluation. The extensive experiments show that attribute normalization greatly improves the detection performance. Statistical normalization outperforms the other schemes if the data set is large.
We also compare the performance of the three anomaly detection algorithms in this paper. In practical use, we detect DDoS attacks with the normalization methods in a real network, and the testing results show their effectiveness.

Our contributions are twofold. First, attribute normalization is very important for many anomaly detection tasks, but it is often ignored. The comparison results with different schemes of attribute normalization presented in this paper provide useful references not only for the anomaly intrusion detection problem, but also for general classification problems that use attributes. To the best of our knowledge, this is the first study that evaluates the impact of attribute normalization on the classification performance. Second, we analyze the merits as well as the demerits of the anomaly detection algorithms k-NN, PCA and SVM and suggest their most suitable environments for the detection.

The remainder of this paper is organized as follows. Section II describes the schemes of attribute normalization. Section III briefly introduces the anomaly detection algorithms used in this paper. Extensive experiments based on KDD Cup 1999 data are given in detail in Section IV. Experiments based on real data are described in Section V. Concluding remarks follow in Section VI.

II. ATTRIBUTE NORMALIZATION SCHEMES

There are generally four steps for anomaly intrusion detection: attribute construction, data preprocessing, model building and anomaly detection (see Fig. 1). This section focuses on attribute normalization in the step of data preprocessing.

Fig. 1. Steps of anomaly intrusion detection.

In this paper, besides the original attributes, we use four schemes of attribute normalization for anomaly intrusion detection.

A. Mean range [0,1] normalization

If we know the maximum and minimum values of a given attribute, it is easy to transform the attribute into the range [0,1] by

$$x_i = \frac{v_i - \min(v_i)}{\max(v_i) - \min(v_i)} \qquad (1)$$

where $v_i$ is the actual value of the attribute, and the maximum and minimum are taken over all values of the attribute. Normally $x_i$ is set to zero if the maximum is equal to the minimum.

B. Statistical normalization

The purpose of statistical normalization is to convert data derived from any Normal distribution into a standard Normal distribution with zero mean and unit variance. Statistical normalization is defined as

$$x_i = \frac{v_i - \mu}{\sigma} \qquad (2)$$

where $\mu$ is the mean of the n values of a given attribute, $\mu = \frac{1}{n}\sum_{i=1}^{n} v_i$, and $\sigma$ is its standard deviation,

$$\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (v_i - \mu)^2} \qquad (3)$$

However, to use statistical normalization the data set should follow a Normal distribution, that is, the number of samples n should be large according to the central limit theorem [20]. Statistical normalization does not scale the values of the attribute into [0,1]. Instead, for a normally distributed attribute it places about 99.7% of the samples in [-3, 3].
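As an illustration only, the two schemes above, together with the ordinal and frequency schemes of Sections II-C and II-D (Eqs. (4) and (5)), can be sketched in plain Python for a single attribute. The function names and the sample values are hypothetical and are not taken from the experiments. Returning zeros for a constant attribute is stated in the text only for Eq. (1); applying the same convention to the other three schemes is an assumption made here.

```python
import math


def mean_range(values):
    """Eq. (1): scale values into [0, 1]; zeros if max equals min."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def statistical(values):
    """Eqs. (2)-(3): subtract the mean, divide by the standard deviation."""
    n = len(values)
    mu = sum(values) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / n)
    if sigma == 0.0:  # assumption: a constant attribute maps to zeros
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def ordinal(values):
    """Eq. (4): ties share a rank and the next distinct value gets rank + 1."""
    rank_of = {v: r for r, v in enumerate(sorted(set(values)), start=1)}
    max_rank = len(rank_of)
    if max_rank == 1:  # assumption: a constant attribute maps to zeros
        return [0.0 for _ in values]
    return [(rank_of[v] - 1) / (max_rank - 1) for v in values]


def frequency(values):
    """Eq. (5): each value divided by the attribute sum (non-negative values assumed)."""
    total = sum(values)
    if total == 0:  # assumption: an all-zero attribute maps to zeros
        return [0.0 for _ in values]
    return [v / total for v in values]


# Made-up values of one attribute, with a repeated value to show tie handling.
sample = [0, 200, 200, 1000, 5000]
print(mean_range(sample))
print(statistical(sample))
print(ordinal(sample))
print(frequency(sample))
```

In a detection setting one would presumably compute the statistics (minimum, maximum, mean, standard deviation) on the training data and reuse them on the test data; the sketch normalizes a single list only to keep the formulas visible.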

C. Ordinal normalization

Ordinal normalization ranks the continuous values of an attribute and then normalizes the rank into [0,1]. Let r be the rank of a given value in an attribute; the normalization is defined as

$$x_i = \frac{r - 1}{\max(r) - 1} \qquad (4)$$

Clearly, ordinal normalization also scales the values of an attribute into [0,1]. In this paper, equal values of an attribute share the same rank, and the rank is not advanced for the repeats. For instance, if some values are ranked as {...,15,15,15}, the next rank is 16 rather than 18.

D. Frequency normalization

Frequency normalization normalizes an attribute by considering the proportion of a value to the summed value of the attribute. It is defined as

$$x_i = \frac{v_i}{\sum_i v_i} \qquad (5)$$

Frequency normalization also scales an attribute into [0,1].

III. ANOMALY INTRUSION DETECTION METHODS

In this paper, we use PCA, k-NN and one-class SVM to evaluate the performance of the different schemes of attribute normalization. Unlike discriminative methods (e.g., decision trees) that learn the distinction between normal and abnormal, the three methods presented in this paper only build normal models and then use the models to detect anomalies.

A. Anomaly detection with Principal Component Analysis (PCA)

Principal Component Analysis (PCA) [21] is a widely used dimensionality reduction technique for data analysis and compression. It is based on transforming a relatively large number of variables into a smaller number of uncorrelated variables by finding a few orthogonal linear combinations of the variables with the largest variance [22]. Let the observations be $X_1, \ldots, X_n$, and suppose each observation is represented by a row vector of length m (the number of attributes). The data set is thus represented by a matrix $X_{n \times m}$. The average observation is defined as $\mu = \frac{1}{n}\sum_{i=1}^{n} X_i$. The deviation of an observation from the average is defined as $\Phi_i = X_i - \mu$.
The sample covariance matrix of the data set is defined as

$$C = \frac{1}{n}\sum_{i=1}^{n} (X_i - \mu)^T (X_i - \mu) \qquad (6)$$

which is an $m \times m$ matrix because each $X_i$ is a row vector. Suppose $(\lambda_1, U_1), (\lambda_2, U_2), \ldots, (\lambda_m, U_m)$ are the m eigenvalue-eigenvector pairs of the sample covariance matrix C, ordered by decreasing eigenvalue. We choose the k eigenvectors having the largest eigenvalues. Often there will be just a few large eigenvalues, and this implies that k is the inherent dimensionality of the subspace governing the signal, while the remaining $(m - k)$ dimensions generally contain noise [22]. The dimensionality k of the subspace can be determined by [22]

$$\frac{\sum_{i=1}^{k} \lambda_i}{\sum_{i=1}^{m} \lambda_i} \ge \alpha \qquad (7)$$

where $\alpha$ is the ratio of the variation in the subspace to the total variation in the space. We form an $m \times k$ matrix U (usually $k \ll m$ for data reduction) whose columns consist of the k eigenvectors. The representation of the data by principal components consists of projecting the data onto the k-dimensional subspace according to the following rule [22]

$$Y_i = (X_i - \mu)U = \Phi_i U \qquad (8)$$

The number of principal eigenvectors $U_1, U_2, \ldots, U_k$ used to represent the distribution of the data is determined by (7). Given an incoming vector T that represents a test sample, we project it onto the k-dimensional subspace representing the normal behavior according to the rule defined by (8). The distance between the test data vector and its reconstruction from the subspace is the distance between the mean-adjusted input data vector $\Phi = T - \mu$ and

$$\Phi_r = (T - \mu)UU^T = \Phi UU^T \qquad (9)$$

If the test data vector is normal, that is, if it is very similar to the training vectors corresponding to normal behavior, the test data vector and its reconstruction will be very similar and the distance between them will be very small. Our intrusion identification model is based on this property. As PCA seeks a projection that best represents the data in a least-squares sense, we use the squared Euclidean distance in the experiments to measure the distance between these two vectors:

$$\varepsilon = \|\Phi - \Phi_r\|^2 \qquad (10)$$

$\varepsilon$ is characterized as the anomaly index.
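The computation in Eqs. (6)-(10) can be sketched with NumPy as follows. This is a minimal sketch under stated assumptions, not the authors' own program: the function names `fit_pca` and `anomaly_index`, the synthetic training data and the value `alpha=0.99` are all hypothetical and chosen only for illustration.

```python
import numpy as np


def fit_pca(X, alpha):
    """Return the mean and the m-by-k matrix U of leading eigenvectors (Eqs. 6-7)."""
    mu = X.mean(axis=0)
    Phi = X - mu
    C = Phi.T @ Phi / X.shape[0]           # m-by-m sample covariance, Eq. (6)
    eigvals, eigvecs = np.linalg.eigh(C)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]      # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    ratio = np.cumsum(eigvals) / eigvals.sum()
    k = int(np.searchsorted(ratio, alpha)) + 1  # smallest k satisfying Eq. (7)
    k = min(k, len(eigvals))
    return mu, eigvecs[:, :k]


def anomaly_index(T, mu, U):
    """Squared distance between the mean-adjusted vector and its reconstruction (Eqs. 9-10)."""
    Phi = T - mu
    Phi_r = Phi @ U @ U.T
    return float(np.sum((Phi - Phi_r) ** 2))


# Synthetic "normal" data lying close to a line in 3-D (illustrative only).
rng = np.random.default_rng(0)
t = rng.normal(size=200)
X_train = np.column_stack([t, 2 * t, -t]) + 0.01 * rng.normal(size=(200, 3))

mu, U = fit_pca(X_train, alpha=0.99)
eps_normal = anomaly_index(np.array([1.0, 2.0, -1.0]), mu, U)     # follows the training pattern
eps_anomalous = anomaly_index(np.array([1.0, -2.0, 1.0]), mu, U)  # breaks the pattern
print(U.shape, eps_normal, eps_anomalous)
```

A vector that follows the correlation structure of the training data is reconstructed almost exactly and receives a small anomaly index, whereas a vector that breaks that structure leaves a large residual.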
If $\varepsilon$ is below a predefined threshold, the vector is identified as normal. Otherwise it is identified as anomalous.

B. Anomaly detection with K-Nearest Neighbor (k-NN)

K-Nearest Neighbor (k-NN) is a method for classifying objects based on the closest training examples in the feature space. It is easy to use and has been demonstrated to be effective for many classification tasks [22]. For a given k, k-NN ranks the neighbors of a test vector T among the training samples and uses the class labels of the k nearest neighbors to predict the class of the test vector. Euclidean distance is usually used for measuring the similarity between two vectors:

$$d(T, X_j) = \|T - X_j\| = \sqrt{\sum_{i=1}^{m} (t_i - x_{ij})^2} \qquad (11)$$

where $t_i$ is the i-th variable in the test vector T, $X_j$ is the j-th vector in the training data set and $x_{ij}$ is the i-th variable in $X_j$.

In the experiments, we use a set of normal data as the training set and suppose that the normal behaviors are embedded in the data set. Given a test vector T, the Euclidean distance between the test vector and each vector in the training data set is calculated by (11). The distance scores are sorted and the k nearest neighbors are chosen to determine whether the test vector is normal or not. In anomaly detection, we average the k smallest distance scores as the anomaly index. If the anomaly index of a test vector is above a threshold $\varepsilon_1$, the test vector is classified as abnormal. Otherwise it is considered normal.

C. Anomaly detection with one-class Support Vector Machine (SVM)

Support Vector Machine (SVM) is a very widely used method for classification. In this paper, we use the one-class SVM proposed by Schölkopf et al. [23]. The one-class SVM algorithm maps the data into a feature space using an appropriate kernel function and then separates the mapped vectors from the origin with maximum margin. The algorithm returns a function f that takes the value +1 in a small region capturing most of the data vectors (e.g., training data), and -1 elsewhere [24]. Given training vectors $X_1, X_2, \ldots, X_l$ belonging to the normal class, the primal form of the quadratic programming problem is

$$\min_{\omega, \xi, \rho} \; \frac{1}{2}\|\omega\|^2 + \frac{1}{\nu l}\sum_{i=1}^{l} \xi_i - \rho \qquad (12)$$

subject to

$$(\omega \cdot \Phi(X_i)) \ge \rho - \xi_i, \quad \xi_i \ge 0 \qquad (13)$$

where $\Phi$ is a kernel map that transforms the training examples to another space. After $\omega$ and $\rho$ solve the problem, the decision function is

$$f(X) = \mathrm{sgn}((\omega \cdot \Phi(X)) - \rho) \qquad (14)$$

In anomaly detection, we use the normal data to build the normal model. If the decision function gives a positive value for a test vector T, the test data is classified as normal. Otherwise, it is considered anomalous.

IV. EXPERIMENTS ON KDD CUP 1999 DATA

A. Data sets

As mentioned, although there is some criticism [9] of the data, we used the data in our experiments for two reasons. First, the data has been widely used for evaluating various intrusion detection methods, so our detection results can be compared with others.
Second, the data provides numerous types of anomalies. The raw data contains traffic in a simulated military network that consists of hundreds of hosts. We use a subset in the experiments. The raw data set was pre-processed into about 5 million connection records by Lee et al. [1] as part of the UCI KDD archive [25]. A connection is a sequence of TCP packets starting and ending at some well defined times, between which data flows from a source IP address to a target IP address under some well defined protocol [25]. In the data set, each network connection is labeled either as normal or as exactly one specific kind of attack.

The network connection data contain 41 features. These features were divided into three groups: basic features of individual TCP connections, traffic features, and content features within a connection suggested by domain knowledge. Among these 41 features, 34 are numeric and 7 are symbolic. Only the 34 numeric features were used in the experiments. Each connection in the data set is thus transformed into a 34-dimensional vector as data input for detection. There are 494,021 connection records in the training set, of which 97,278 are normal and 396,743 are attacks. There are 22 types of attacks in total in the subset, and these attacks fall into one of 4 categories: DoS, denial-of-service (e.g., teardrop); R2L, unauthorized access from a remote machine (e.g., password guessing); U2R, unauthorized access to local superuser (root) privileges by a local unprivileged user (e.g., buffer overflow attacks); and PROBE, surveillance and other probing (e.g., port scanning).

In a real computer network environment, collecting large amounts of purely normal data is often difficult for a practical IDS.
In the experiments, a smaller data set of 7,000 randomly selected normal network connections is thus used for training the normal model, and a relatively large data set, consisting of 10,000 randomly selected normal network connections, 20% of the DoS attacks and all the Probe, R2L and U2R attack data, is used for detection. The data sets used in the experiments are described in Table I.

TABLE I
DATA DESCRIPTION

Type     Total (#)   Training (#)   Test (#)
Normal   97,278      7,000          10,000
DoS      391,458     0              78,291
Probe    4,107       0              4,107
R2L      1,126       0              1,126
U2R      52          0              52

B. Parameters and criterion of evaluation

In the experiments reported in this paper, we used the same training data for training and the same test data for testing to guarantee a fair comparison. The parameters in the detection are very important. For PCA, we set the ratio $\alpha$ to the value found most desirable in our previous experimental results [18][19]. For k-NN, we set k = 10, as this is a good choice [4]. For SVM, we use the RBF kernel and adjust the parameter $\nu$ to obtain the different results. We wrote our own programs for k-NN and PCA and used the LibSVM tools (version 2.88) [26] for SVM.

We use Receiver Operating Characteristic (ROC) curves to evaluate the detection performance. The ROC curve is the plot of the Detection Rate (DR), calculated as the percentage of intrusions detected, against the False Positive Rate (FPR), calculated as the percentage of normal connections falsely classified as intrusions. Points nearer to the upper left corner of the ROC plot are the most desirable. There is a tradeoff between the DR and the FPR, and the ROC curve is obtained by setting different thresholds.
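To make the evaluation criterion concrete, the sketch below computes the k-NN anomaly index of Section III-B with k = 10 and derives (FPR, DR) points by sweeping a threshold, as described above. It is only an illustration under stated assumptions: the function names, the two-dimensional synthetic data and the threshold grid are hypothetical and unrelated to the KDD Cup 1999 records.

```python
import math
import random


def knn_anomaly_index(test_vec, train, k=10):
    """Average Euclidean distance (Eq. 11) to the k nearest training vectors."""
    dists = sorted(math.dist(test_vec, x) for x in train)
    k = min(k, len(dists))
    return sum(dists[:k]) / k


def roc_points(scores, is_attack, thresholds):
    """One (FPR, DR) pair per threshold; a score above the threshold raises an alarm."""
    n_attack = sum(is_attack)
    n_normal = len(is_attack) - n_attack
    points = []
    for th in thresholds:
        alarms = [s > th for s in scores]
        dr = sum(a and y for a, y in zip(alarms, is_attack)) / n_attack
        fpr = sum(a and not y for a, y in zip(alarms, is_attack)) / n_normal
        points.append((fpr, dr))
    return points


# Synthetic data: normal behaviour clusters near the origin, attacks near (6, 6).
random.seed(0)
train = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(200)]
normal_test = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(50)]
attack_test = [(random.gauss(6, 1), random.gauss(6, 1)) for _ in range(50)]

test = normal_test + attack_test
is_attack = [False] * len(normal_test) + [True] * len(attack_test)
scores = [knn_anomaly_index(t, train, k=10) for t in test]

thresholds = [0.5 * i for i in range(41)]  # 0.0, 0.5, ..., 20.0
points = roc_points(scores, is_attack, thresholds)
for th, (fpr, dr) in list(zip(thresholds, points))[:6]:
    print(f"threshold={th:.1f}  FPR={fpr:.2f}  DR={dr:.2f}")
```

Lowering the threshold moves a point toward the upper right of the ROC plot (more detections, more false alarms), and raising it moves the point toward the lower left, which is the tradeoff mentioned in the text.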

C. Evaluation on methods of attribute normalization

We used the four schemes defined in Section II to normalize the data. The normalized data as well as the original data are then fed into the k-NN, PCA and SVM methods for training and testing. The overall detection results using k-NN, PCA and SVM with the different attribute normalization schemes are presented in Figs. 2-4. As probe attacks are difficult to detect [19], we also present the results of probe attack detection in Figs. 5-7.

Fig. 2. Overall detection results with k-NN.
Fig. 3. Overall detection results with PCA.
Fig. 4. Overall detection results with SVM.
Fig. 5. Probe attack detection results with k-NN.

From Figs. 2-7, it is clear that attribute normalization improves the detection performance for all the detection methods. In detail, attribute normalization greatly improves the detection performance for k-NN and SVM based anomaly detection, while normalization helps little for the PCA method. k-NN and SVM based detection methods mainly compute distances between vectors, so the detection results are very sensitive to the scale of the attributes. In contrast, PCA seeks new major coordinates and is not very sensitive to normalization, because we choose $\alpha$ so that the subspace captures most of the variance contained in the data. Statistical normalization is the best scheme, except for probe attack detection with PCA, where another scheme performs better. Statistical normalization not only considers the mean scale of the attribute values but also takes into account their distribution, and this may help a lot for the detection. In general, for detection with distance based methods such as k-NN and SVM, statistical normalization is the best choice; one of the remaining schemes ranks second, and the other two are not very effective but are still better than the original attributes.

D. Evaluation on methods of anomaly detection

We compare the detection results of the different anomaly detection methods, k-NN, PCA and SVM, using only one attribute normalization scheme.
The overall detection results as well as the probe attack detection results are shown in Fig. 8 and Fig. 9, respectively. From the figures, it is seen that k-NN is better than SVM and PCA in terms of detection accuracy. The testing results show that k-NN, SVM and PCA all achieve satisfactory results.

Fig. 6. Probe attack detection results with PCA.
Fig. 7. Probe attack detection results with SVM.
Fig. 8. Comparison: Overall detection results with k-NN, PCA and SVM, using attribute normalization.
Fig. 9. Comparison: Probe attack detection results with k-NN, PCA and SVM.

V. DETECTING DDOS ATTACKS IN A REAL NETWORK

As an important part of the work on a DDoS attack analysis and detection project in the Institute, we collected various major DDoS attack tools and ran them in the laboratory to collect network traffic of DDoS attacks. The attack tools are Trinoo, TFN, Stacheldraht, TFN2K and Mstream. Using these tools, we implemented DDoS attacks with ICMP flood, SYN flood, UDP flood, Stream (TCP-ACK flood) and Smurf style attacks. A large set of normal as well as DDoS attack network traffic was then collected for analysis.

In the experiments, we use a tool (footnote 1) to transform the raw tcpdump traffic files into connection records with 41 attributes (defined in [1]). We only use the 34 continuous attributes [12] in our experiments. After the raw data (footnote 2) was collected and the attributes were constructed, we randomly selected 5,000 normal connections for training, and 10,000 normal connections as well as more than 36,000 DDoS attack connections for testing. Based on the results of Section IV, we use k-NN for DDoS attack detection with and without attribute normalization for comparison, and the results are summarized in Fig. 10. From the figure, it is clear that attribute normalization greatly improves the detection results and that k-NN can achieve good results with attribute normalization.

Fig. 10. Result of the k-NN method: statistical normalization vs. original data.

Footnote 1: The attribute construction programs were written by our team members.
Footnote 2: The data are available upon request to the authors.

VI. CONCLUDING REMARKS

Anomaly intrusion detection is by nature a pattern classification problem. Attribute construction and classification methods are usually the core issues. The classification methods should correspond to the attributes for effective detection. Many methods have been successfully employed for anomaly detection. However, the question of whether attribute normalization is essential with respect to the detection performance still remains. If it is essential, the question becomes which method of attribute normalization is most effective.

This paper tries to provide answers to the above two questions by case studies. The answers can be applied to other general classification problems, not only to anomaly detection. In our experiments, we used 4 schemes to normalize the attributes. k-NN, PCA and SVM are employed as anomaly intrusion detection methods. KDD Cup 1999 data are first used for the testing. The experimental results show that attribute normalization improves the detection performance with k-NN, PCA and SVM. Chang and Lin [26] suggest that data be scaled in a mean range fashion (e.g., mean range [0,1]) before using the LibSVM tools. Our experiments show that statistical normalization is the better choice if the data sample is large, even though mean range normalization can also improve the detection performance. A large data set collected from real networks is also used in the experiments for DDoS attack detection, and the results are consistent with the previous findings. Through our work, we suggest that attribute normalization should always be considered for the classification problem.

There are some exceptions to the need for attribute normalization. For example, some machine learning algorithms (e.g., decision trees) require the original attributes to mine rules. In this case attribute normalization cannot be conducted because information may be lost. Another exception is when part of the data cannot be provided or the data is of streaming type. In this case, because the data is incomplete or is streaming in real time, we cannot calculate the mean and standard deviation for normalization.

Among the detection methods used in this paper, k-NN does not need a training process, unlike PCA and SVM. The computational complexity of k-NN for the detection is O(pqm), where p is the number of events in the test set, q is the number of events in the training set and m is the dimensionality of the events.
It is clear that k-NN needs a lot of computation if the data is very high-dimensional and the number of training samples is very large. The computation for PCA and SVM, on the other hand, is relatively time consuming during the training process, but much less calculation is needed during the test process. Moreover, system resources may be largely saved thanks to the compressed normal models of PCA and SVM. It is suggested that PCA is more suitable for processing large amounts of data for anomaly intrusion detection. However, as an easily used method, k-NN is appropriate for intrusion detection if the data is not so massive. k-NN is also light-weight, so it is feasible to periodically retrain the detection model simply by incorporating new training data.

For future work, we will try to design a more effective scheme of attribute normalization that not only considers the distribution of one attribute, but also takes into account the cross properties among all the attributes in the data set. How to normalize streaming data is also being investigated.

ACKNOWLEDGMENT

The work of the NTNU part was supported by the Centre for Quantifiable Quality of Service (Q2S) in Communication Systems, Centre of Excellence, which is appointed by the Research Council of Norway and funded by the Research Council, NTNU and UNINETT. The research of the first author is also supported by the ERCIM fellowship program. The work of the TELECOM part was supported by the ACI DADDi Project.

REFERENCES

[1] Lee, W., Stolfo, S.J., Mok, K.W.: A data mining framework for building intrusion detection models. In: IEEE Symposium on Security and Privacy. (1999)
[2] Forrest, S., Hofmeyr, S.A., Somayaji, A., Longstaff, T.A.: A sense of self for Unix processes. In: IEEE Symposium on Security and Privacy. (1996)
[3] Schonlau, M., Theus, M.: Detecting masquerades in intrusion detection based on unpopular commands. Inf. Process. Lett. 76(1-2) (2000)
[4] Wang, W., Gombault, S.: Distance measures for anomaly intrusion detection. In: Security and Management. (2007)
[5] Ingham, K.L., Inoue, H.: Comparing anomaly detection techniques for HTTP. In: RAID. (2007)
[6] Wang, K., Stolfo, S.J.: Anomalous payload-based network intrusion detection. In: RAID. (2004)
[7] Nassar, M., State, R., Festor, O.: Monitoring SIP traffic using support vector machines. In: RAID. (2008)
[8] MIT: MIT Lincoln Laboratory - DARPA intrusion detection evaluation (retrieved March 2009).
[9] McHugh, J.: Testing intrusion detection systems: a critique of the 1998 and 1999 DARPA intrusion detection system evaluations as performed by Lincoln Laboratory. ACM Trans. Inf. Syst. Secur. 3(4) (2000)
[10] Jin, S., Yeung, D.S., Wang, X.: Network intrusion detection in covariance feature space. Pattern Recognition 40(8) (2007)
[11] Katos, V.: Network intrusion detection: Evaluating cluster, discriminant, and logit analysis. Inf. Sci. 177(15) (2007)
[12] Bouzida, Y., Cuppens, F.: Neural networks vs. decision trees for intrusion detection. In: Proceedings of the first IEEE workshop on Monitoring, Attack Detection and Mitigation. (2006)
[13] Mukkamala, S., Sung, A.H., Abraham, A.: Intrusion detection using an ensemble of intelligent paradigms. J. Network and Computer Applications 28(2) (2005)
[14] Li, Y., Fang, B., Guo, L., Chen, Y.: Network anomaly detection based on TCM-KNN algorithm. In: ASIACCS. (2007)
[15] Ma, W., Tran, D., Sharma, D.: A study on the feature selection of network traffic for intrusion detection purpose. In: ISI. (2008)
[16] Liao, Y., Vemuri, V.R., Pasos, A.: Adaptive anomaly detection with evolving connectionist systems. J. Network and Computer Applications 30(1) (2007)
[17] Shyu, M., Chen, S., Sarinnapakorn, K., Chang, L.: A novel anomaly detection scheme based on principal component classifier. In: IEEE Foundations and New Directions of Data Mining Workshop. (2003)
[18] Wang, W., Battiti, R.: Identifying intrusions in computer networks with principal component analysis. In: ARES. (2006)
[19] Wang, W., Guan, X., Zhang, X.: Processing of massive audit data streams for real-time anomaly intrusion detection. Computer Communications 31(1) (2008)
[20] Durrett, R.: Probability: Theory and Examples. Wadsworth, Pacific Grove, California (1991)
[21] Jolliffe, I.T.: Principal Component Analysis. 2nd edn. Springer-Verlag, Berlin (2002)
[22] Duda, R.O., Hart, P.E., Stork, D.G.: Pattern Classification. 2nd edn. China Machine Press (2004)
[23] Schölkopf, B., Platt, J.C., Shawe-Taylor, J., Smola, A.J., Williamson, R.C.: Estimating the support of a high-dimensional distribution. Neural Computation 13(7) (2001)
[24] Manevitz, L.M., Yousef, M.: One-class SVMs for document classification. Journal of Machine Learning Research 2 (2001)
[25] KDD-Data: KDD Cup 1999 data (retrieved March 2009). (1999)
[26] Chang, C.C., Lin, C.J.: LIBSVM: a library for support vector machines. (2001) Software available at cjlin/libsvm.


More information

Anomaly Intrusion Detection System Using Hierarchical Gaussian Mixture Model

Anomaly Intrusion Detection System Using Hierarchical Gaussian Mixture Model 264 IJCSNS International Journal of Computer Science and Network Security, VOL.8 No.8, August 2008 Anomaly Intrusion Detection System Using Hierarchical Gaussian Mixture Model M. Bahrololum and M. Khaleghi

More information

A Detailed Analysis on NSL-KDD Dataset Using Various Machine Learning Techniques for Intrusion Detection

A Detailed Analysis on NSL-KDD Dataset Using Various Machine Learning Techniques for Intrusion Detection A Detailed Analysis on NSL-KDD Dataset Using Various Machine Learning Techniques for Intrusion Detection S. Revathi Ph.D. Research Scholar PG and Research, Department of Computer Science Government Arts

More information

Network Traffic Measurements and Analysis

Network Traffic Measurements and Analysis DEIB - Politecnico di Milano Fall, 2017 Introduction Often, we have only a set of features x = x 1, x 2,, x n, but no associated response y. Therefore we are not interested in prediction nor classification,

More information

A Comparison Between the Silhouette Index and the Davies-Bouldin Index in Labelling IDS Clusters

A Comparison Between the Silhouette Index and the Davies-Bouldin Index in Labelling IDS Clusters A Comparison Between the Silhouette Index and the Davies-Bouldin Index in Labelling IDS Clusters Slobodan Petrović NISlab, Department of Computer Science and Media Technology, Gjøvik University College,

More information

Combining Cross-Correlation and Fuzzy Classification to Detect Distributed Denial-of-Service Attacks*

Combining Cross-Correlation and Fuzzy Classification to Detect Distributed Denial-of-Service Attacks* Combining Cross-Correlation and Fuzzy Classification to Detect Distributed Denial-of-Service Attacks* Wei Wei 1, Yabo Dong 1, Dongming Lu 1, and Guang Jin 2 1 College of Compute Science and Technology,

More information

Review on Data Mining Techniques for Intrusion Detection System

Review on Data Mining Techniques for Intrusion Detection System Review on Data Mining Techniques for Intrusion Detection System Sandeep D 1, M. S. Chaudhari 2 Research Scholar, Dept. of Computer Science, P.B.C.E, Nagpur, India 1 HoD, Dept. of Computer Science, P.B.C.E,

More information

Statistical Analysis of Metabolomics Data. Xiuxia Du Department of Bioinformatics & Genomics University of North Carolina at Charlotte

Statistical Analysis of Metabolomics Data. Xiuxia Du Department of Bioinformatics & Genomics University of North Carolina at Charlotte Statistical Analysis of Metabolomics Data Xiuxia Du Department of Bioinformatics & Genomics University of North Carolina at Charlotte Outline Introduction Data pre-treatment 1. Normalization 2. Centering,

More information

Tensor Sparse PCA and Face Recognition: A Novel Approach

Tensor Sparse PCA and Face Recognition: A Novel Approach Tensor Sparse PCA and Face Recognition: A Novel Approach Loc Tran Laboratoire CHArt EA4004 EPHE-PSL University, France tran0398@umn.edu Linh Tran Ho Chi Minh University of Technology, Vietnam linhtran.ut@gmail.com

More information

Feature Selection in the Corrected KDD -dataset

Feature Selection in the Corrected KDD -dataset Feature Selection in the Corrected KDD -dataset ZARGARI, Shahrzad Available from Sheffield Hallam University Research Archive (SHURA) at: http://shura.shu.ac.uk/17048/ This document is the author deposited

More information

The Effects of Outliers on Support Vector Machines

The Effects of Outliers on Support Vector Machines The Effects of Outliers on Support Vector Machines Josh Hoak jrhoak@gmail.com Portland State University Abstract. Many techniques have been developed for mitigating the effects of outliers on the results

More information

K-Nearest-Neighbours with a Novel Similarity Measure for Intrusion Detection

K-Nearest-Neighbours with a Novel Similarity Measure for Intrusion Detection K-Nearest-Neighbours with a Novel Similarity Measure for Intrusion Detection Zhenghui Ma School of Computer Science The University of Birmingham Edgbaston, B15 2TT Birmingham, UK Ata Kaban School of Computer

More information

Intrusion Detection and Malware Analysis

Intrusion Detection and Malware Analysis Intrusion Detection and Malware Analysis Anomaly-based IDS Pavel Laskov Wilhelm Schickard Institute for Computer Science Taxonomy of anomaly-based IDS Features: Packet headers Byte streams Syntactic events

More information

Ensemble of Soft Computing Techniques for Intrusion Detection. Ensemble of Soft Computing Techniques for Intrusion Detection

Ensemble of Soft Computing Techniques for Intrusion Detection. Ensemble of Soft Computing Techniques for Intrusion Detection Global Journal of Computer Science and Technology Network, Web & Security Volume 13 Issue 13 Version 1.0 Year 2013 Type: Double Blind Peer Reviewed International Research Journal Publisher: Global Journals

More information

International Journal of Scientific & Engineering Research, Volume 4, Issue 7, July-2013 ISSN

International Journal of Scientific & Engineering Research, Volume 4, Issue 7, July-2013 ISSN 1 Review: Boosting Classifiers For Intrusion Detection Richa Rawat, Anurag Jain ABSTRACT Network and host intrusion detection systems monitor malicious activities and the management station is a technique

More information

Enhanced Multivariate Correlation Analysis (MCA) Based Denialof-Service

Enhanced Multivariate Correlation Analysis (MCA) Based Denialof-Service International Journal of Computer Science & Mechatronics A peer reviewed International Journal Article Available online www.ijcsm.in smsamspublications.com Vol.1.Issue 2. 2015 Enhanced Multivariate Correlation

More information

KBSVM: KMeans-based SVM for Business Intelligence

KBSVM: KMeans-based SVM for Business Intelligence Association for Information Systems AIS Electronic Library (AISeL) AMCIS 2004 Proceedings Americas Conference on Information Systems (AMCIS) December 2004 KBSVM: KMeans-based SVM for Business Intelligence

More information

Cluster Analysis. Mu-Chun Su. Department of Computer Science and Information Engineering National Central University 2003/3/11 1

Cluster Analysis. Mu-Chun Su. Department of Computer Science and Information Engineering National Central University 2003/3/11 1 Cluster Analysis Mu-Chun Su Department of Computer Science and Information Engineering National Central University 2003/3/11 1 Introduction Cluster analysis is the formal study of algorithms and methods

More information

FACE RECOGNITION USING SUPPORT VECTOR MACHINES

FACE RECOGNITION USING SUPPORT VECTOR MACHINES FACE RECOGNITION USING SUPPORT VECTOR MACHINES Ashwin Swaminathan ashwins@umd.edu ENEE633: Statistical and Neural Pattern Recognition Instructor : Prof. Rama Chellappa Project 2, Part (b) 1. INTRODUCTION

More information

Outlier Detection Using Unsupervised and Semi-Supervised Technique on High Dimensional Data

Outlier Detection Using Unsupervised and Semi-Supervised Technique on High Dimensional Data Outlier Detection Using Unsupervised and Semi-Supervised Technique on High Dimensional Data Ms. Gayatri Attarde 1, Prof. Aarti Deshpande 2 M. E Student, Department of Computer Engineering, GHRCCEM, University

More information

Dimension Reduction CS534

Dimension Reduction CS534 Dimension Reduction CS534 Why dimension reduction? High dimensionality large number of features E.g., documents represented by thousands of words, millions of bigrams Images represented by thousands of

More information

Modeling Intrusion Detection Systems With Machine Learning And Selected Attributes

Modeling Intrusion Detection Systems With Machine Learning And Selected Attributes Modeling Intrusion Detection Systems With Machine Learning And Selected Attributes Thaksen J. Parvat USET G.G.S.Indratrastha University Dwarka, New Delhi 78 pthaksen.sit@sinhgad.edu Abstract Intrusion

More information

CS 195-5: Machine Learning Problem Set 5

CS 195-5: Machine Learning Problem Set 5 CS 195-5: Machine Learning Problem Set 5 Douglas Lanman dlanman@brown.edu 26 November 26 1 Clustering and Vector Quantization Problem 1 Part 1: In this problem we will apply Vector Quantization (VQ) to

More information

Anomaly Detection on Data Streams with High Dimensional Data Environment

Anomaly Detection on Data Streams with High Dimensional Data Environment Anomaly Detection on Data Streams with High Dimensional Data Environment Mr. D. Gokul Prasath 1, Dr. R. Sivaraj, M.E, Ph.D., 2 Department of CSE, Velalar College of Engineering & Technology, Erode 1 Assistant

More information

Detection of Anomalies using Online Oversampling PCA

Detection of Anomalies using Online Oversampling PCA Detection of Anomalies using Online Oversampling PCA Miss Supriya A. Bagane, Prof. Sonali Patil Abstract Anomaly detection is the process of identifying unexpected behavior and it is an important research

More information

Basic Concepts in Intrusion Detection

Basic Concepts in Intrusion Detection Technology Technical Information Services Security Engineering Roma, L Università Roma Tor Vergata, 23 Aprile 2007 Basic Concepts in Intrusion Detection JOVAN GOLIĆ Outline 2 Introduction Classification

More information

Abnormal Network Traffic Detection Based on Semi-Supervised Machine Learning

Abnormal Network Traffic Detection Based on Semi-Supervised Machine Learning 2017 International Conference on Electronic, Control, Automation and Mechanical Engineering (ECAME 2017) ISBN: 978-1-60595-523-0 Abnormal Network Traffic Detection Based on Semi-Supervised Machine Learning

More information

Analyzing TCP Traffic Patterns Using Self Organizing Maps

Analyzing TCP Traffic Patterns Using Self Organizing Maps Analyzing TCP Traffic Patterns Using Self Organizing Maps Stefano Zanero D.E.I.-Politecnico di Milano, via Ponzio 34/5-20133 Milano Italy zanero@elet.polimi.it Abstract. The continuous evolution of the

More information

Local Linear Approximation for Kernel Methods: The Railway Kernel

Local Linear Approximation for Kernel Methods: The Railway Kernel Local Linear Approximation for Kernel Methods: The Railway Kernel Alberto Muñoz 1,JavierGonzález 1, and Isaac Martín de Diego 1 University Carlos III de Madrid, c/ Madrid 16, 890 Getafe, Spain {alberto.munoz,

More information

COSC160: Detection and Classification. Jeremy Bolton, PhD Assistant Teaching Professor

COSC160: Detection and Classification. Jeremy Bolton, PhD Assistant Teaching Professor COSC160: Detection and Classification Jeremy Bolton, PhD Assistant Teaching Professor Outline I. Problem I. Strategies II. Features for training III. Using spatial information? IV. Reducing dimensionality

More information

Feature Selection Using Modified-MCA Based Scoring Metric for Classification

Feature Selection Using Modified-MCA Based Scoring Metric for Classification 2011 International Conference on Information Communication and Management IPCSIT vol.16 (2011) (2011) IACSIT Press, Singapore Feature Selection Using Modified-MCA Based Scoring Metric for Classification

More information

COMPARISON OF THE ACCURACY OF BIVARIATE REGRESSION AND BOX PLOT ANALYSIS IN DETECTING DDOS ATTACKS

COMPARISON OF THE ACCURACY OF BIVARIATE REGRESSION AND BOX PLOT ANALYSIS IN DETECTING DDOS ATTACKS International Journal of Electronics and Communication Engineering & Technology (IJECET) Volume 6, Issue 12, Dec 2015, pp. 43-48, Article ID: IJECET_06_12_007 Available online at http://www.iaeme.com/ijecetissues.asp?jtype=ijecet&vtype=6&itype=12

More information

Dimension Reduction in Network Attacks Detection Systems

Dimension Reduction in Network Attacks Detection Systems Nonlinear Phenomena in Complex Systems, vol. 17, no. 3 (2014), pp. 284-289 Dimension Reduction in Network Attacks Detection Systems V. V. Platonov and P. O. Semenov Saint-Petersburg State Polytechnic University,

More information

Bagging for One-Class Learning

Bagging for One-Class Learning Bagging for One-Class Learning David Kamm December 13, 2008 1 Introduction Consider the following outlier detection problem: suppose you are given an unlabeled data set and make the assumptions that one

More information

Face Recognition Using Vector Quantization Histogram and Support Vector Machine Classifier Rong-sheng LI, Fei-fei LEE *, Yan YAN and Qiu CHEN

Face Recognition Using Vector Quantization Histogram and Support Vector Machine Classifier Rong-sheng LI, Fei-fei LEE *, Yan YAN and Qiu CHEN 2016 International Conference on Artificial Intelligence: Techniques and Applications (AITA 2016) ISBN: 978-1-60595-389-2 Face Recognition Using Vector Quantization Histogram and Support Vector Machine

More information

A Network Intrusion Detection System Architecture Based on Snort and. Computational Intelligence

A Network Intrusion Detection System Architecture Based on Snort and. Computational Intelligence 2nd International Conference on Electronics, Network and Computer Engineering (ICENCE 206) A Network Intrusion Detection System Architecture Based on Snort and Computational Intelligence Tao Liu, a, Da

More information

Diagonal Principal Component Analysis for Face Recognition

Diagonal Principal Component Analysis for Face Recognition Diagonal Principal Component nalysis for Face Recognition Daoqiang Zhang,2, Zhi-Hua Zhou * and Songcan Chen 2 National Laboratory for Novel Software echnology Nanjing University, Nanjing 20093, China 2

More information

Collateral Representative Subspace Projection Modeling for Supervised Classification

Collateral Representative Subspace Projection Modeling for Supervised Classification Collateral Representative Subspace Projection Modeling for Supervised Classification Thiago Quirino, Zongxing Xie, Mei-Ling Shyu Department of Electrical and Computer Engineering University of Miami, Coral

More information

An Intelligent Clustering Algorithm for High Dimensional and Highly Overlapped Photo-Thermal Infrared Imaging Data

An Intelligent Clustering Algorithm for High Dimensional and Highly Overlapped Photo-Thermal Infrared Imaging Data An Intelligent Clustering Algorithm for High Dimensional and Highly Overlapped Photo-Thermal Infrared Imaging Data Nian Zhang and Lara Thompson Department of Electrical and Computer Engineering, University

More information

Data Distortion for Privacy Protection in a Terrorist Analysis System

Data Distortion for Privacy Protection in a Terrorist Analysis System Data Distortion for Privacy Protection in a Terrorist Analysis System Shuting Xu, Jun Zhang, Dianwei Han, and Jie Wang Department of Computer Science, University of Kentucky, Lexington KY 40506-0046, USA

More information

The Comparative Study of Machine Learning Algorithms in Text Data Classification*

The Comparative Study of Machine Learning Algorithms in Text Data Classification* The Comparative Study of Machine Learning Algorithms in Text Data Classification* Wang Xin School of Science, Beijing Information Science and Technology University Beijing, China Abstract Classification

More information

Face Recognition Based on LDA and Improved Pairwise-Constrained Multiple Metric Learning Method

Face Recognition Based on LDA and Improved Pairwise-Constrained Multiple Metric Learning Method Journal of Information Hiding and Multimedia Signal Processing c 2016 ISSN 2073-4212 Ubiquitous International Volume 7, Number 5, September 2016 Face Recognition ased on LDA and Improved Pairwise-Constrained

More information

Intrusion Detection System based on Support Vector Machine and BN-KDD Data Set

Intrusion Detection System based on Support Vector Machine and BN-KDD Data Set Intrusion Detection System based on Support Vector Machine and BN-KDD Data Set Razieh Baradaran, Department of information technology, university of Qom, Qom, Iran R.baradaran@stu.qom.ac.ir Mahdieh HajiMohammadHosseini,

More information

A Distance-Based Classifier Using Dissimilarity Based on Class Conditional Probability and Within-Class Variation. Kwanyong Lee 1 and Hyeyoung Park 2

A Distance-Based Classifier Using Dissimilarity Based on Class Conditional Probability and Within-Class Variation. Kwanyong Lee 1 and Hyeyoung Park 2 A Distance-Based Classifier Using Dissimilarity Based on Class Conditional Probability and Within-Class Variation Kwanyong Lee 1 and Hyeyoung Park 2 1. Department of Computer Science, Korea National Open

More information

Performance Degradation Assessment and Fault Diagnosis of Bearing Based on EMD and PCA-SOM

Performance Degradation Assessment and Fault Diagnosis of Bearing Based on EMD and PCA-SOM Performance Degradation Assessment and Fault Diagnosis of Bearing Based on EMD and PCA-SOM Lu Chen and Yuan Hang PERFORMANCE DEGRADATION ASSESSMENT AND FAULT DIAGNOSIS OF BEARING BASED ON EMD AND PCA-SOM.

More information

Content-based image and video analysis. Machine learning

Content-based image and video analysis. Machine learning Content-based image and video analysis Machine learning for multimedia retrieval 04.05.2009 What is machine learning? Some problems are very hard to solve by writing a computer program by hand Almost all

More information

EVALUATION OF INTRUSION DETECTION TECHNIQUES AND ALGORITHMS IN TERMS OF PERFORMANCE AND EFFICIENCY THROUGH DATA MINING

EVALUATION OF INTRUSION DETECTION TECHNIQUES AND ALGORITHMS IN TERMS OF PERFORMANCE AND EFFICIENCY THROUGH DATA MINING International Journal of Computer Science Engineering and Information Technology Research (IJCSEITR) ISSN 2249-6831 Vol. 3, Issue 2, Jun 2013, 47-54 TJPRC Pvt. Ltd. EVALUATION OF INTRUSION DETECTION TECHNIQUES

More information

Network Intrusion Detection Using Fast k-nearest Neighbor Classifier

Network Intrusion Detection Using Fast k-nearest Neighbor Classifier Network Intrusion Detection Using Fast k-nearest Neighbor Classifier K. Swathi 1, D. Sree Lakshmi 2 1,2 Asst. Professor, Prasad V. Potluri Siddhartha Institute of Technology, Vijayawada Abstract: Fast

More information

A Data Mining Framework for Building Intrusion Detection Models

A Data Mining Framework for Building Intrusion Detection Models A Data Mining Framework for Building Intrusion Detection Models Wenke Lee Salvatore J. Stolfo Kui W. Mok Computer Science Department, Columbia University 500 West 120th Street, New York, NY 10027 {wenke,sal,mok}@cs.columbia.edu

More information

Face Recognition using Eigenfaces SMAI Course Project

Face Recognition using Eigenfaces SMAI Course Project Face Recognition using Eigenfaces SMAI Course Project Satarupa Guha IIIT Hyderabad 201307566 satarupa.guha@research.iiit.ac.in Ayushi Dalmia IIIT Hyderabad 201307565 ayushi.dalmia@research.iiit.ac.in Abstract

More information

Model Selection for Anomaly Detection in Wireless Ad Hoc Networks

Model Selection for Anomaly Detection in Wireless Ad Hoc Networks Model Selection for Anomaly Detection in Wireless Ad Hoc Networks Hongmei Deng, Roger Xu Intelligent Automation Inc., Rockville, MD 2855 {hdeng, hgxu}@i-a-i.com Abstract-Anomaly detection has been actively

More information

An Ensemble Data Mining Approach for Intrusion Detection in a Computer Network

An Ensemble Data Mining Approach for Intrusion Detection in a Computer Network International Journal of Science and Engineering Investigations vol. 6, issue 62, March 2017 ISSN: 2251-8843 An Ensemble Data Mining Approach for Intrusion Detection in a Computer Network Abisola Ayomide

More information

Detection of Network Intrusions with PCA and Probabilistic SOM

Detection of Network Intrusions with PCA and Probabilistic SOM Detection of Network Intrusions with PCA and Probabilistic SOM Palakollu Srinivasarao M.Tech, Computer Networks and Information Security, MVGR College Of Engineering, AP, INDIA ---------------------------------------------------------------------***---------------------------------------------------------------------

More information

10-701/15-781, Fall 2006, Final

10-701/15-781, Fall 2006, Final -7/-78, Fall 6, Final Dec, :pm-8:pm There are 9 questions in this exam ( pages including this cover sheet). If you need more room to work out your answer to a question, use the back of the page and clearly

More information

Linear Discriminant Analysis for 3D Face Recognition System

Linear Discriminant Analysis for 3D Face Recognition System Linear Discriminant Analysis for 3D Face Recognition System 3.1 Introduction Face recognition and verification have been at the top of the research agenda of the computer vision community in recent times.

More information

Analysis of TCP Segment Header Based Attack Using Proposed Model

Analysis of TCP Segment Header Based Attack Using Proposed Model Chapter 4 Analysis of TCP Segment Header Based Attack Using Proposed Model 4.0 Introduction Though TCP has been extensively used for the wired network but is being used for mobile Adhoc network in the

More information

Bayesian Learning Networks Approach to Cybercrime Detection

Bayesian Learning Networks Approach to Cybercrime Detection Bayesian Learning Networks Approach to Cybercrime Detection N S ABOUZAKHAR, A GANI and G MANSON The Centre for Mobile Communications Research (C4MCR), University of Sheffield, Sheffield Regent Court, 211

More information

Feature Selection Using Principal Feature Analysis

Feature Selection Using Principal Feature Analysis Feature Selection Using Principal Feature Analysis Ira Cohen Qi Tian Xiang Sean Zhou Thomas S. Huang Beckman Institute for Advanced Science and Technology University of Illinois at Urbana-Champaign Urbana,

More information

Semi-Supervised Clustering with Partial Background Information

Semi-Supervised Clustering with Partial Background Information Semi-Supervised Clustering with Partial Background Information Jing Gao Pang-Ning Tan Haibin Cheng Abstract Incorporating background knowledge into unsupervised clustering algorithms has been the subject

More information

Object and Action Detection from a Single Example

Object and Action Detection from a Single Example Object and Action Detection from a Single Example Peyman Milanfar* EE Department University of California, Santa Cruz *Joint work with Hae Jong Seo AFOSR Program Review, June 4-5, 29 Take a look at this:

More information

Facial Expression Classification with Random Filters Feature Extraction

Facial Expression Classification with Random Filters Feature Extraction Facial Expression Classification with Random Filters Feature Extraction Mengye Ren Facial Monkey mren@cs.toronto.edu Zhi Hao Luo It s Me lzh@cs.toronto.edu I. ABSTRACT In our work, we attempted to tackle

More information

McPAD and HMM-Web: two different approaches for the detection of attacks against Web applications

McPAD and HMM-Web: two different approaches for the detection of attacks against Web applications McPAD and HMM-Web: two different approaches for the detection of attacks against Web applications Davide Ariu, Igino Corona, Giorgio Giacinto, Fabio Roli University of Cagliari, Dept. of Electrical and

More information

Learning to Recognize Faces in Realistic Conditions

Learning to Recognize Faces in Realistic Conditions 000 001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050

More information

Fuzzy Bidirectional Weighted Sum for Face Recognition

Fuzzy Bidirectional Weighted Sum for Face Recognition Send Orders for Reprints to reprints@benthamscience.ae The Open Automation and Control Systems Journal, 2014, 6, 447-452 447 Fuzzy Bidirectional Weighted Sum for Face Recognition Open Access Pengli Lu

More information

Software Documentation of the Potential Support Vector Machine

Software Documentation of the Potential Support Vector Machine Software Documentation of the Potential Support Vector Machine Tilman Knebel and Sepp Hochreiter Department of Electrical Engineering and Computer Science Technische Universität Berlin 10587 Berlin, Germany

More information

Use of Multi-category Proximal SVM for Data Set Reduction

Use of Multi-category Proximal SVM for Data Set Reduction Use of Multi-category Proximal SVM for Data Set Reduction S.V.N Vishwanathan and M Narasimha Murty Department of Computer Science and Automation, Indian Institute of Science, Bangalore 560 012, India Abstract.

More information

Mahalanobis Distance Map Approach for Anomaly Detection

Mahalanobis Distance Map Approach for Anomaly Detection Edith Cowan University Research Online Australian Information Security Management Conference Conferences, Symposia and Campus Events 2010 Mahalanobis Distance Map Approach for Anomaly Detection Aruna Jamdagnil

More information

SCENARIO BASED ADAPTIVE PREPROCESSING FOR STREAM DATA USING SVM CLASSIFIER

SCENARIO BASED ADAPTIVE PREPROCESSING FOR STREAM DATA USING SVM CLASSIFIER SCENARIO BASED ADAPTIVE PREPROCESSING FOR STREAM DATA USING SVM CLASSIFIER P.Radhabai Mrs.M.Priya Packialatha Dr.G.Geetha PG Student Assistant Professor Professor Dept of Computer Science and Engg Dept

More information

Towards Traffic Anomaly Detection via Reinforcement Learning and Data Flow

Towards Traffic Anomaly Detection via Reinforcement Learning and Data Flow Towards Traffic Anomaly Detection via Reinforcement Learning and Data Flow Arturo Servin Computer Science, University of York aservin@cs.york.ac.uk Abstract. Protection of computer networks against security

More information

A Comparative Study of Supervised and Unsupervised Learning Schemes for Intrusion Detection. NIS Research Group Reza Sadoddin, Farnaz Gharibian, and

A Comparative Study of Supervised and Unsupervised Learning Schemes for Intrusion Detection. NIS Research Group Reza Sadoddin, Farnaz Gharibian, and A Comparative Study of Supervised and Unsupervised Learning Schemes for Intrusion Detection NIS Research Group Reza Sadoddin, Farnaz Gharibian, and Agenda Brief Overview Machine Learning Techniques Clustering/Classification

More information

EVALUATIONS OF THE EFFECTIVENESS OF ANOMALY BASED INTRUSION DETECTION SYSTEMS BASED ON AN ADAPTIVE KNN ALGORITHM

EVALUATIONS OF THE EFFECTIVENESS OF ANOMALY BASED INTRUSION DETECTION SYSTEMS BASED ON AN ADAPTIVE KNN ALGORITHM EVALUATIONS OF THE EFFECTIVENESS OF ANOMALY BASED INTRUSION DETECTION SYSTEMS BASED ON AN ADAPTIVE KNN ALGORITHM Assosiate professor, PhD Evgeniya Nikolova, BFU Assosiate professor, PhD Veselina Jecheva,

More information

Linear Discriminant Analysis in Ottoman Alphabet Character Recognition

Linear Discriminant Analysis in Ottoman Alphabet Character Recognition Linear Discriminant Analysis in Ottoman Alphabet Character Recognition ZEYNEB KURT, H. IREM TURKMEN, M. ELIF KARSLIGIL Department of Computer Engineering, Yildiz Technical University, 34349 Besiktas /

More information

"GET /cgi-bin/purchase?itemid=109agfe111;ypcat%20passwd mail 200

GET /cgi-bin/purchase?itemid=109agfe111;ypcat%20passwd mail 200 128.111.41.15 "GET /cgi-bin/purchase? itemid=1a6f62e612&cc=mastercard" 200 128.111.43.24 "GET /cgi-bin/purchase?itemid=61d2b836c0&cc=visa" 200 128.111.48.69 "GET /cgi-bin/purchase? itemid=a625f27110&cc=mastercard"

More information

Noise-based Feature Perturbation as a Selection Method for Microarray Data

Noise-based Feature Perturbation as a Selection Method for Microarray Data Noise-based Feature Perturbation as a Selection Method for Microarray Data Li Chen 1, Dmitry B. Goldgof 1, Lawrence O. Hall 1, and Steven A. Eschrich 2 1 Department of Computer Science and Engineering

More information

Face Detection using Hierarchical SVM

Face Detection using Hierarchical SVM Face Detection using Hierarchical SVM ECE 795 Pattern Recognition Christos Kyrkou Fall Semester 2010 1. Introduction Face detection in video is the process of detecting and classifying small images extracted

More information
