Clustering of Windows Security Events by means of Frequent Pattern Mining
Rosa Basagoiti 1, Urko Zurutuza 1, Asier Aztiria 1, Guzmán Santafé 2 and Mario Reyes 2
1 Mondragon University, Mondragon, Spain
{rbasagoiti;uzurutuza;aaztiria}@eps.mondragon.edu
2 Grupo S21sec Gestión S.A., Orcoyen, Spain
{gsantafe;mreyes}@s21sec.com

Abstract. This paper summarizes the results obtained by applying Data Mining techniques to detect usual behaviors in the use of computers. Based on real security event logs, two different clustering strategies have been developed. On the one hand, a clustering process has been carried out using characteristics that describe the events quantitatively. On the other hand, a qualitative approach has been developed, based mainly on the interruptions among security events. Both approaches have proved to be effective and complementary for clustering security audit trails of Windows systems and extracting useful behavior patterns.

Key words: Windows security event analysis, data mining, frequent pattern mining, intrusion detection, anomaly detection

1 Introduction

The idea of discovering behavioral patterns from a set of event logs in order to detect unusual behavior or malicious events is not novel. It came up in the 80s, when James P. Anderson, in a seminal work in the area of Intrusion Detection Systems [1], suggested that the common behavior of a user could be portrayed by analyzing the set of event logs generated during his/her use of a computer. Events that fall outside such usual behavior could then be considered attacks, or at least unusual. There are many works along these lines, but most of them have been developed for Unix systems. This paper focuses on events produced by Windows operating systems. The complexity of such systems is even greater due to the large amount of data they usually generate.
In this work, different experiments have been carried out following two different approaches. On the one hand, we have created clusters based on characteristics that summarize the activity from a quantitative point of view. The second approach tries to find logical clusters by analyzing the interruptions among events.

The remainder of this paper is organized as follows. Section 2 provides a literature review of different tools and approaches when performing the analysis
of log data. In Section 3 we analyse the nature of the problem and define some aspects to be considered. Section 4 describes the experiments and the results we have obtained. Finally, Section 5 provides some conclusions and ongoing challenges.

2 Related Work

Research in Intrusion Detection began in the 1980s, when Anderson suggested that the normal behavior of a user could be characterized by analyzing his/her usual set of event logs. Since then, the area has attracted a significant number of researchers. The first application to detect unusual events or attacks was named IDES (Intrusion Detection Expert System) and was developed by Dorothy Denning [2]. The basic idea of such a system was to monitor the normal activity in a mainframe and, based on those activities, define a set of rules that would allow the detection of anomalies.

It is worth mentioning that the core of the problem remains the same today, while the complexity of the systems has increased considerably. Whereas Denning's approach analyzed the event logs of a mainframe to which the users were connected, a current system is composed of many servers and workstations, each of which creates its own event logs.

More systems that applied data mining algorithms to event logs were proposed, but all of them were based on centralized Unix events. In [3] a method for discovering temporal patterns in event sequences was proposed. Debar et al. proposed a system that could analyze the behavior of user activity using neural networks [4]. Neural networks were also used for anomaly detection based on Solaris BSM (Basic Security Module) audit data [5]. Lee and Stolfo [6] used audit data from Unix machines to create behavior patterns using association rules and frequent episode mining, so that a set of events occurring in a given time window could be discovered. In [7] Lane investigated the use of Hidden Markov Models for user pattern generation.
The source of the event logs is the main difference between these works and ours. Almost 90% of the population uses Windows systems, and the events are stored in each host, so the complexity of centralizing and analyzing this information increases significantly. In addition, our approach focuses on discovering the behavior of the hosts, and not of the users related to them. In this way we do not focus only on usage patterns for intrusion detection, but on any anomalous behavior that could happen (e.g. misconfigurations).

In order to centralize all this information and make it easier to use, Security Information Management (SIM) tools have been developed. Currently, there are many applications developed with the purpose of detecting unusual behaviors. Tools such as Tripwire, Samhain, GFI EventsManager
and especially OSSEC and Cisco MARS (Cisco Security Monitoring, Analysis, and Response System) are examples of this. Nevertheless, only GFI EventsManager, OSSEC and Cisco MARS can be used in Windows environments, and their analysis strategies need to be improved. These tools, except Cisco MARS, are mainly focused on monitoring configuration changes, administration actions, system errors and suspected security problems. However, none of them is able to generate sequential models that allow the detection of unusual events.

In this sense, different approaches have tried to discover the correlation between events [8]. Some of them have even worked with summarized data [9]. Specific tools for mining event logs have also been developed [10]. Other options that have been studied are techniques from time series mining [11] and techniques for mining frequent itemsets [12].

There is a clear need for a system that logically clusters the security event logs generated in Windows systems. Therefore, in the following sections we describe an approach to classify and correlate such events so that they can be used in further applications.

3 Analysis of Windows security event logs

Windows classifies events into different categories that are stored in independent logs, such as the System, Application, DNS and Security logs. This paper focuses on the events stored in the Security log, such as session logons or changes of privileges. Security auditing can be activated from the User Manager for Domains (NT) or from the security policies (W2K, W2K3), and it is available in all versions of Windows Professional and Server.
Each event contains information such as the type of event, date and time, event source (the software that registered the event), category, the event that was produced (event ID), the user who produced it and the station where the event occurred. Finally, Windows defines nine different categories of security events.

Account logon events: The authentication of a user from the point of view of the system. A single event of this type is not very meaningful, but many attempts in a short period of time can indicate scanning activity or a brute force attack.
Account management: Activity related to the creation, management and deletion of individual user accounts or groups of users.
Directory service access: Access to any object that contains System Access Control Lists (SACL).
Logon events: User authentication activity coming from the local station as well as from the system that triggered the activity in a network.
Object access: Access to file system and registry objects. It provides an easy-to-use way to record changes to sensitive files.
Policy changes: Changes in the access policy and some other modifications.
Privilege use: Windows allows granular permissions to be defined for carrying out specific tasks.
Process tracking: Detailed information about when a process starts and finishes, or when programs are activated.
System events: Information that affects the integrity of the system.

In this work, we consider the events generated by 4 different Domain Controllers (DC) during 4 days. From this point on, these servers will be named Alfa, Beta, Gamma and Delta. Table 1 shows the number of events generated by each station each day. It is worth mentioning that the Gamma server generates many more events than the rest of the DCs. Moreover, the more events a system generates, the more complex their analysis becomes. That is why data mining techniques seem a promising approach for this type of data.

Table 1. Number of events to be analysed in the experiment
Columns: Day 1, Day 2, Day 3, Day 4, Total. Rows: Gamma, Beta, Delta, Alfa. [values not preserved]

4 Clustering Event Sources

In this section we describe the experiments carried out using Windows event logs. We have followed the usual steps suggested for any Data Mining process [13].

4.1 Learning the application domain

The event logs have some special features that have to be taken into account in the clustering process. Therefore, the dataset is first analyzed by extracting descriptive statistics for each attribute. These statistics only show the number and the percentage of different values for each attribute. The usefulness of each attribute was determined by the distribution of its values. All attributes for which more than 80% of the events shared the same value were ruled out. Attributes that were statistically dependent on another attribute were ruled out too (for instance Message vs EventID).
After analyzing the data we realized that, although there were 22,369,089 events, the number of different types of events (different EventIDs) was 28. We decided to analyze the events generated by each server, ruling out all the attributes except Workstation name, Event ID, User ID and Timestamp.
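The 80% rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the record layout and the sample values are hypothetical, and the statistical-dependence check (Message vs EventID) is not covered.

```python
from collections import Counter


def dominant_share(values):
    """Fraction of records that hold the most common value of an attribute."""
    counts = Counter(values)
    return counts.most_common(1)[0][1] / len(values)


def select_attributes(records, threshold=0.80):
    """Keep the attributes whose most common value covers at most `threshold`
    of the records; attributes above the threshold are ruled out."""
    kept = []
    for name in records[0]:
        share = dominant_share([record[name] for record in records])
        if share <= threshold:
            kept.append(name)
    return kept


# Hypothetical sample records; the attribute values are invented for illustration.
events = [
    {"EventID": 540, "UserID": "u1", "Type": "Success"},
    {"EventID": 538, "UserID": "u2", "Type": "Success"},
    {"EventID": 576, "UserID": "u1", "Type": "Success"},
    {"EventID": 540, "UserID": "u3", "Type": "Success"},
    {"EventID": 538, "UserID": "u1", "Type": "Failure"},
    {"EventID": 540, "UserID": "u2", "Type": "Success"},
]

selected = select_attributes(events)
print(selected)
```

In this sample the Type attribute takes the same value in five of six records (about 83%), so it is ruled out, while EventID and UserID are kept.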
4.2 Feature Selection

The attribute Event ID is the key feature for the analysis: the statistics used as input are computed from it. This step of the process is critical and may directly influence the results we obtain. The statistics are proposed as indicators that might be key to expressing computer behavior based on security event logs. After analyzing the information, the following features were identified in order to cluster sources of Windows logs.

1. Number of total events (num id)
2. Number of different types of events (num diff id)
3. Number of different users (num diff user)
4. Most frequent event (freq event 1)
5. Second most frequent event (freq event 2)
6. Percentage of events equal to the most frequent event (perc event 1)
7. Percentage of events equal to the second most frequent event (perc event 2)
8. Most frequent event in the most active second (freq event max sec)
9. Most frequent event in the most active minute (freq event max min)
10. Event of the longest sequence of the same event (long event id)
11. Length of the longest sequence of the same event (long event num)

4.3 Application of clustering techniques

Once the attributes have been selected, two different clustering processes have been carried out.

Clustering of statistic data using K-means. Clustering is a data mining technique that groups similar instances based on the similarities of their attributes. The basic idea is to minimize the distance between the instances of the same cluster and maximize the distance between different clusters. There are many different clustering techniques, such as hierarchical clustering or partitional clustering. In our case, the simplest approach (K-means) seems to be enough. One particularity of K-means is that the number of clusters to discover must be given in advance.
In this work, since the aim is to obtain patterns of the different machines, this constant is known, i.e. 4 in our case. The K-means technique [14] selects K points as initial centroids of the clusters. Then it assigns every instance to the closest centroid and re-computes the centroid of each cluster. This process is repeated until the centroids of the clusters remain in the same position. We have applied this technique to the data collected from the different events and summarized in Table 1. We know in advance that the first four instances correspond to the events that occurred during four days in the station named Alfa, the following four instances correspond to the Beta station, and so on. Applying the K-means technique to the selected attributes (num id, num diff id and long event num in our case) produced four clusters, which match the four servers analyzed.
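The K-means loop described above (assign each instance to the closest centroid, re-compute the centroids, stop when nothing changes) can be sketched as follows. Several details are assumptions, since the text does not state them: the Euclidean distance, the standardization of the columns, and the deterministic farthest-first initialization. The 16 feature rows are invented placeholders for (num id, num diff id, long event num), not the values measured in the experiment.

```python
import math


def standardize(rows):
    """Rescale each column to zero mean and unit variance (an assumption,
    used here so that the event count does not dominate the distance)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def farthest_first(points, k):
    """Deterministic initialization: start from the first point and repeatedly
    add the point farthest from the centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(euclidean(p, c) for c in centroids)))
    return centroids


def kmeans(points, k, max_iter=100):
    """Return one cluster label per point."""
    centroids = farthest_first(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: euclidean(p, centroids[j]))
                      for p in points]
        if new_labels == labels:  # assignments stable, so centroids no longer move
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels


# Hypothetical (num_id, num_diff_id, long_event_num) rows: four days per server,
# in the order Alfa, Beta, Gamma, Delta.
features = [
    (100000, 12, 40), (105000, 12, 42), (98000, 13, 41), (102000, 12, 39),
    (400000, 18, 300), (410000, 18, 310), (395000, 19, 305), (405000, 18, 295),
    (3000000, 25, 900), (3100000, 26, 880), (2950000, 25, 910), (3050000, 25, 905),
    (800000, 8, 100), (820000, 8, 105), (790000, 9, 95), (810000, 8, 102),
]

labels = kmeans(standardize(features), k=4)
print(labels)
```

With well-separated placeholder data like this, each block of four consecutive rows receives a single label, which mirrors the one-cluster-per-server outcome reported above.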
Discovering frequent event sequences. So far, we have treated the events as independent and analyzed them from a statistical point of view. The events we are considering in this work are the following ones:

538: User Logoff
540: Successful Network Logon
576: Special privileges assigned to new logon
578: Privileged object operation

If we order the events by their timestamps, we get a sequence of events, which can be analyzed in different ways. This second approach focuses on the analysis of the 16 different sequences generated by the 4 DCs during 4 days. A sequence of events is a series of nominal symbols that indicates the occurrence of different events. Our work has focused on analysing which events usually interrupt previous events. Let us consider that the system has recorded a sequence in which an occurrence of event 540 is followed by event 538. We could say that, in this case, the event 540 (Successful Network Logon) has been interrupted by the event 538 (User Logoff). We have considered all the possible interruptions; since we are considering 28 different events, we have generated a 28 x 28 matrix. In that matrix we store how many times an event has interrupted a previous event. Let us consider the example depicted in Figure 1. It means that the event 540 has been interrupted 2500 times by the event 538.

Fig. 1. Interruptions matrix

The content of this matrix is represented by means of an array, where the first 28 values define the interruptions of the first event (in this case the event 538, User Logoff). Thus, the first value indicates how many times the event 538 is interrupted by itself (which we take as 0), the second one how many times it is interrupted by the event 540 (Successful Network Logon), and so on.
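The construction of the interruptions matrix and its flattened array can be sketched as follows. The sketch assumes that an event is interrupted whenever the next event in timestamp order is a different one; the function names and the short sample sequence are hypothetical, and only four event IDs are used instead of the 28 of the experiment.

```python
def interruption_matrix(sequence, event_ids):
    """Count how often each event (row) is interrupted by another event (column).

    `sequence` is the list of event IDs ordered by timestamp. An event followed
    by the same event is not counted, so the diagonal stays at 0.
    """
    index = {event: i for i, event in enumerate(event_ids)}
    n = len(event_ids)
    matrix = [[0] * n for _ in range(n)]
    for previous, current in zip(sequence, sequence[1:]):
        if previous != current:
            matrix[index[previous]][index[current]] += 1
    return matrix


def flatten(matrix):
    """Row-major array: first all interruptions of the first event, and so on."""
    return [value for row in matrix for value in row]


# Hypothetical example with four event IDs and an invented sequence.
event_ids = [538, 540, 576, 578]
sequence = [540, 538, 540, 540, 576, 540, 538, 578, 538]

matrix = interruption_matrix(sequence, event_ids)
series = flatten(matrix)
print(matrix[1][0])  # times event 540 was interrupted by event 538
print(series)
```

Each server-day yields one such flattened series, so series from different days or servers can be compared position by position.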
After representing these values in an array, we plotted them, so that the plot Alfa1 shows the interruptions for the Alfa server on the first day, Alfa2 shows the interruptions of the same server on the second day, and so on. The following figures show the series obtained for the stations Alfa and Beta on the first two days.

Fig. 2. Day 1 and 2 of Alfa server

Fig. 3. Day 1 and 2 of Beta server

Looking at the figures, we realized that the results for a particular server on different days were very similar. Moreover, the dissimilarities with the rest of the servers could facilitate the clustering process. Thus, taking as starting point the 16 series (Alfa1, Alfa2, Alfa3, Alfa4, Beta1, Beta2, Beta3, Beta4, Gamma1, Gamma2, Gamma3, Gamma4, Delta1, Delta2, Delta3, Delta4), we carried out a clustering process using the K-means technique again. In order to compare, and therefore cluster, the interruptions, we need a criterion to measure similarity. Let us consider these two sets of interruptions X and Y:
$X = X_1, X_2, X_3, \ldots, X_n$ (1)

$Y = Y_1, Y_2, Y_3, \ldots, Y_n$ (2)

The similarity between sets of interruptions is given by the Manhattan distance between them, D(X, Y):

$D(X, Y) = \sum_{i=1}^{n} |X_i - Y_i|$ (3)

Table 2 shows the results of the clustering process. 15 out of 16 series were correctly classified; only one series of the Gamma DC was misclassified.

Table 2. Clustering of frequent event sequences
Columns: Series number, Name of the series, Assigned cluster. Rows: series 1 to 16 (four series each for Alfa, Beta, Gamma and Delta). [assigned clusters not preserved]

5 Conclusions and ongoing challenges

Discovering frequent patterns in event logs is the first step towards detecting unusual behavior or anomalies. Besides proving that it is possible to detect patterns in event logs, the experiments have shown that different servers have different patterns and that these can be found and identified, even in Windows systems. The experiments carried out at different stages have also shown that the same server has very similar patterns on different days. However, these
experiments have been carried out with few Domain Controllers, so it would be interesting to validate the approach with a larger set of servers and workstations. Finally, it is worth saying that these results are work in progress that aims to detect anomalies in security event logs by analyzing the event sources.

References

1. Anderson, J.P.: Computer Security Threat Monitoring and Surveillance. Technical report, Fort Washington (1980)
2. Denning, D.E.: An Intrusion-Detection Model. IEEE Transactions on Software Engineering, 13(2) (1987)
3. Teng, H., Chen, K., Lu, S.: Adaptive real-time anomaly detection using inductively generated sequential patterns. Proceedings of the 1990 IEEE Computer Society Symposium on Research in Security and Privacy, Oakland, California, May 7-9 (1990)
4. Debar, H., Becker, M., Siboni, D.: A Neural Network Component for an Intrusion Detection System. Proceedings, IEEE Symposium on Research in Computer Security and Privacy (1992)
5. Endler, D.: Intrusion detection: Applying machine learning to Solaris audit data. In Proceedings of the 1998 Annual Computer Security Applications Conference (ACSAC 98), Scottsdale, AZ. IEEE Computer Society Press, Los Alamitos, CA, December (1998)
6. Lee, W., Stolfo, S.: Data Mining Approaches for Intrusion Detection. In Proceedings of the Seventh USENIX Security Symposium (SECURITY 98), San Antonio, TX, January (1998)
7. Lane, T., Brodley, C.E.: Temporal Sequence Learning and Data Reduction for Anomaly Detection. ACM Transactions on Information and System Security, 2 (1999)
8. LaRosa, C., Xiong, L., Mandelberg, K.: Frequent pattern mining for kernel trace data. SAC 08: Proceedings of the 2008 ACM Symposium on Applied Computing, Brazil (2008)
9. Rana, A.Z., Bell, J.: Using event attribute name-value pairs for summarizing log data. AusCERT2007 (2007)
10. Vaarandi, R.: Mining Event Logs with SLCT and LogHound. Proceedings of the 2008 IEEE/IFIP Network Operations and Management Symposium (2008)
11. Viinikka, J.: Time series modeling for IDS Alert Management. ACM ASIAN Symposium on Information (2006)
12. Burdick, D., Calimlim, M., Gehrke, J.: A maximal frequent itemset algorithm for transactional databases. IEEE Trans. Knowl. Data Eng. 17(11) (2005)
13. Fayyad, U., Piatetsky-Shapiro, G., Smyth, P.: The KDD process for extracting useful knowledge from volumes of data. Communications of the ACM, 39(11):27-34 (1996)
14. MacQueen, J.B.: Some Methods for Classification and Analysis of Multivariate Observations. In Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, 1. University of California Press (1967)
A New Method For Forecasting Enrolments Combining Time-Variant Fuzzy Logical Relationship Groups And K-Means Clustering Nghiem Van Tinh 1, Vu Viet Vu 1, Tran Thi Ngoc Linh 1 1 Thai Nguyen University of
More informationData Mining: An experimental approach with WEKA on UCI Dataset
Data Mining: An experimental approach with WEKA on UCI Dataset Ajay Kumar Dept. of computer science Shivaji College University of Delhi, India Indranath Chatterjee Dept. of computer science Faculty of
More informationAnalysis of Dendrogram Tree for Identifying and Visualizing Trends in Multi-attribute Transactional Data
Analysis of Dendrogram Tree for Identifying and Visualizing Trends in Multi-attribute Transactional Data D.Radha Rani 1, A.Vini Bharati 2, P.Lakshmi Durga Madhuri 3, M.Phaneendra Babu 4, A.Sravani 5 Department
More informationRedefining and Enhancing K-means Algorithm
Redefining and Enhancing K-means Algorithm Nimrat Kaur Sidhu 1, Rajneet kaur 2 Research Scholar, Department of Computer Science Engineering, SGGSWU, Fatehgarh Sahib, Punjab, India 1 Assistant Professor,
More informationRecommendation System Using Yelp Data CS 229 Machine Learning Jia Le Xu, Yingran Xu
Recommendation System Using Yelp Data CS 229 Machine Learning Jia Le Xu, Yingran Xu 1 Introduction Yelp Dataset Challenge provides a large number of user, business and review data which can be used for
More informationLast time. Security Policies and Models. Trusted Operating System Design. Bell La-Padula and Biba Security Models Information Flow Control
Last time Security Policies and Models Bell La-Padula and Biba Security Models Information Flow Control Trusted Operating System Design Design Elements Security Features 10-1 This time Trusted Operating
More informationInformation mining and information retrieval : methods and applications
Information mining and information retrieval : methods and applications J. Mothe, C. Chrisment Institut de Recherche en Informatique de Toulouse Université Paul Sabatier, 118 Route de Narbonne, 31062 Toulouse
More informationIndexing in Search Engines based on Pipelining Architecture using Single Link HAC
Indexing in Search Engines based on Pipelining Architecture using Single Link HAC Anuradha Tyagi S. V. Subharti University Haridwar Bypass Road NH-58, Meerut, India ABSTRACT Search on the web is a daily
More information732A54/TDDE31 Big Data Analytics
732A54/TDDE31 Big Data Analytics Lecture 10: Machine Learning with MapReduce Jose M. Peña IDA, Linköping University, Sweden 1/27 Contents MapReduce Framework Machine Learning with MapReduce Neural Networks
More informationData Mining. Introduction. Hamid Beigy. Sharif University of Technology. Fall 1395
Data Mining Introduction Hamid Beigy Sharif University of Technology Fall 1395 Hamid Beigy (Sharif University of Technology) Data Mining Fall 1395 1 / 21 Table of contents 1 Introduction 2 Data mining
More informationFrom Data to Actionable Knowledge: Applying Data Mining to the Problem of Intrusion Detection
From Data to Actionable Knowledge: Applying Data Mining to the Problem of Intrusion Detection Terrance Goan Stottler Henke Associates Inc. 1107 NE 45th St. Seattle, WA 98105 Phone: 206-545-1478 Fax: 206-545-7227
More informationTime Series Clustering: A Superior Alternative for Market Basket Analysis
Time Series Clustering: A Superior Alternative for Market Basket Analysis Swee Chuan Tan, Jess Pei San Lau SIM University, School of Business 535A Clementi Road, Singapore {jamestansc, pslau002}@unisim.edu.sg
More informationMining Frequent Itemsets for data streams over Weighted Sliding Windows
Mining Frequent Itemsets for data streams over Weighted Sliding Windows Pauray S.M. Tsai Yao-Ming Chen Department of Computer Science and Information Engineering Minghsin University of Science and Technology
More informationA Study on Mining of Frequent Subsequences and Sequential Pattern Search- Searching Sequence Pattern by Subset Partition
A Study on Mining of Frequent Subsequences and Sequential Pattern Search- Searching Sequence Pattern by Subset Partition S.Vigneswaran 1, M.Yashothai 2 1 Research Scholar (SRF), Anna University, Chennai.
More informationNETWORK FAULT DETECTION - A CASE FOR DATA MINING
NETWORK FAULT DETECTION - A CASE FOR DATA MINING Poonam Chaudhary & Vikram Singh Department of Computer Science Ch. Devi Lal University, Sirsa ABSTRACT: Parts of the general network fault management problem,
More informationMethod for security monitoring and special filtering traffic mode in info communication systems
Method for security monitoring and special filtering traffic mode in info communication systems Sherzod Rajaboyevich Gulomov Provide Information Security department Tashkent University of Information Technologies
More informationAn Efficient Algorithm for Finding the Support Count of Frequent 1-Itemsets in Frequent Pattern Mining
An Efficient Algorithm for Finding the Support Count of Frequent 1-Itemsets in Frequent Pattern Mining P.Subhashini 1, Dr.G.Gunasekaran 2 Research Scholar, Dept. of Information Technology, St.Peter s University,
More informationSoftware Architecture Recovery based on Dynamic Analysis
Software Architecture Recovery based on Dynamic Analysis Aline Vasconcelos 1,2, Cláudia Werner 1 1 COPPE/UFRJ System Engineering and Computer Science Program P.O. Box 68511 ZIP 21945-970 Rio de Janeiro
More informationKBSVM: KMeans-based SVM for Business Intelligence
Association for Information Systems AIS Electronic Library (AISeL) AMCIS 2004 Proceedings Americas Conference on Information Systems (AMCIS) December 2004 KBSVM: KMeans-based SVM for Business Intelligence
More informationEffective Intrusion Type Identification with Edit Distance for HMM-Based Anomaly Detection System
Effective Intrusion Type Identification with Edit Distance for HMM-Based Anomaly Detection System Ja-Min Koo and Sung-Bae Cho Dept. of Computer Science, Yonsei University, Shinchon-dong, Seodaemoon-ku,
More informationINTRODUCTION TO DATA MINING. Daniel Rodríguez, University of Alcalá
INTRODUCTION TO DATA MINING Daniel Rodríguez, University of Alcalá Outline Knowledge Discovery in Datasets Model Representation Types of models Supervised Unsupervised Evaluation (Acknowledgement: Jesús
More informationData Mining. Introduction. Hamid Beigy. Sharif University of Technology. Fall 1394
Data Mining Introduction Hamid Beigy Sharif University of Technology Fall 1394 Hamid Beigy (Sharif University of Technology) Data Mining Fall 1394 1 / 20 Table of contents 1 Introduction 2 Data mining
More informationFM-WAP Mining: In Search of Frequent Mutating Web Access Patterns from Historical Web Usage Data
FM-WAP Mining: In Search of Frequent Mutating Web Access Patterns from Historical Web Usage Data Qiankun Zhao Nanyang Technological University, Singapore and Sourav S. Bhowmick Nanyang Technological University,
More informationFlowzilla: A Methodology for Detecting Data Transfer Anomalies in Research Networks. Anna Giannakou, Daniel Gunter, Sean Peisert
Flowzilla: A Methodology for Detecting Data Transfer Anomalies in Research Networks Anna Giannakou, Daniel Gunter, Sean Peisert Research Networks Scientific applications that process large amounts of data
More informationTemporal Weighted Association Rule Mining for Classification
Temporal Weighted Association Rule Mining for Classification Purushottam Sharma and Kanak Saxena Abstract There are so many important techniques towards finding the association rules. But, when we consider
More information1. INTRODUCTION. AMS Subject Classification. 68U10 Image Processing
ANALYSING THE NOISE SENSITIVITY OF SKELETONIZATION ALGORITHMS Attila Fazekas and András Hajdu Lajos Kossuth University 4010, Debrecen PO Box 12, Hungary Abstract. Many skeletonization algorithms have been
More informationMeans for Intrusion Detection. Intrusion Detection. INFO404 - Lecture 13. Content
Intrusion Detection INFO404 - Lecture 13 21.04.2009 nfoukia@infoscience.otago.ac.nz Content Definition Network vs. Host IDS Misuse vs. Behavior Based IDS Means for Intrusion Detection Definitions (1) Intrusion:
More informationMining of Web Server Logs using Extended Apriori Algorithm
International Association of Scientific Innovation and Research (IASIR) (An Association Unifying the Sciences, Engineering, and Applied Research) International Journal of Emerging Technologies in Computational
More informationON HANDLING REPLAY ATTACKS IN INTRUSION DETECTION SYSTEMS A. M. Sokolov, D. A. Rachkovskij
International Journal "Information Theories & Applications" Vol.10 341 ON HANDLING REPLAY ATTACKS IN INTRUSION DETECTION SYSTEMS A. M. Sokolov, D. A. Rachkovskij Abstract: We propose a method for detecting
More informationClustering Documents in Large Text Corpora
Clustering Documents in Large Text Corpora Bin He Faculty of Computer Science Dalhousie University Halifax, Canada B3H 1W5 bhe@cs.dal.ca http://www.cs.dal.ca/ bhe Yongzheng Zhang Faculty of Computer Science
More informationThe Application of K-medoids and PAM to the Clustering of Rules
The Application of K-medoids and PAM to the Clustering of Rules A. P. Reynolds, G. Richards, and V. J. Rayward-Smith School of Computing Sciences, University of East Anglia, Norwich Abstract. Earlier research
More informationChange Analysis in Spatial Data by Combining Contouring Algorithms with Supervised Density Functions
Change Analysis in Spatial Data by Combining Contouring Algorithms with Supervised Density Functions Chun Sheng Chen 1, Vadeerat Rinsurongkawong 1, Christoph F. Eick 1, and Michael D. Twa 2 1 Department
More informationAn Apriori-like algorithm for Extracting Fuzzy Association Rules between Keyphrases in Text Documents
An Apriori-lie algorithm for Extracting Fuzzy Association Rules between Keyphrases in Text Documents Guy Danon Department of Information Systems Engineering Ben-Gurion University of the Negev Beer-Sheva
More informationDIVERSITY-BASED INTERESTINGNESS MEASURES FOR ASSOCIATION RULE MINING
DIVERSITY-BASED INTERESTINGNESS MEASURES FOR ASSOCIATION RULE MINING Huebner, Richard A. Norwich University rhuebner@norwich.edu ABSTRACT Association rule interestingness measures are used to help select
More informationCS Review. Prof. Clarkson Spring 2017
CS 5430 Review Prof. Clarkson Spring 2017 Recall: Audit logs Recording: what to log what not to log how to log locally remotely how to protect the log Reviewing: manual exploration automated analysis MANUAL
More informationA SYSTEM FOR DETECTION AND PRVENTION OF PATH BASED DENIAL OF SERVICE ATTACK
A SYSTEM FOR DETECTION AND PRVENTION OF PATH BASED DENIAL OF SERVICE ATTACK P.Priya 1, S.Tamilvanan 2 1 M.E-Computer Science and Engineering Student, Bharathidasan Engineering College, Nattrampalli. 2
More informationAn Intelligent Clustering Algorithm for High Dimensional and Highly Overlapped Photo-Thermal Infrared Imaging Data
An Intelligent Clustering Algorithm for High Dimensional and Highly Overlapped Photo-Thermal Infrared Imaging Data Nian Zhang and Lara Thompson Department of Electrical and Computer Engineering, University
More informationOptimal Clustering and Statistical Identification of Defective ICs using I DDQ Testing
Optimal Clustering and Statistical Identification of Defective ICs using I DDQ Testing A. Rao +, A.P. Jayasumana * and Y.K. Malaiya* *Colorado State University, Fort Collins, CO 8523 + PalmChip Corporation,
More informationDenial of Service (DoS) Attack Detection by Using Fuzzy Logic over Network Flows
Denial of Service (DoS) Attack Detection by Using Fuzzy Logic over Network Flows S. Farzaneh Tabatabaei 1, Mazleena Salleh 2, MohammadReza Abbasy 3 and MohammadReza NajafTorkaman 4 Faculty of Computer
More informationStatistical Databases: Query Restriction
Statistical Databases: Query Restriction Nina Mishra January 21, 2004 Introduction A statistical database typically contains information about n individuals where n is very large. A statistical database
More informationThe Comparison of CBA Algorithm and CBS Algorithm for Meteorological Data Classification Mohammad Iqbal, Imam Mukhlash, Hanim Maria Astuti
Information Systems International Conference (ISICO), 2 4 December 2013 The Comparison of CBA Algorithm and CBS Algorithm for Meteorological Data Classification Mohammad Iqbal, Imam Mukhlash, Hanim Maria
More informationCYSE 411/AIT 681 Secure Software Engineering Topic #3. Risk Management
CYSE 411/AIT 681 Secure Software Engineering Topic #3. Risk Management Instructor: Dr. Kun Sun Outline 1. Risk management 2. Standards on Evaluating Secure System 3. Security Analysis using Security Metrics
More informationDMSA TECHNIQUE FOR FINDING SIGNIFICANT PATTERNS IN LARGE DATABASE
DMSA TECHNIQUE FOR FINDING SIGNIFICANT PATTERNS IN LARGE DATABASE Saravanan.Suba Assistant Professor of Computer Science Kamarajar Government Art & Science College Surandai, TN, India-627859 Email:saravanansuba@rediffmail.com
More informationStandard: Event Monitoring
October 24, 2016 Page 1 Contents Revision History... 4 Executive Summary... 4 Introduction and Purpose... 5 Scope... 5 Standard... 5 Audit Log Standard: Nature of Information and Retention Period... 5
More informationThe Application of Artificial Neural Networks to Misuse Detection: Initial Results
The Application of Artificial Neural Networks to Misuse Detection: Initial Results James Cannady Georgia Tech Research Institute Georgia Institute of Technology Atlanta, GA 30332 james.cannady@gtri.gatech.edu
More informationIntrusion Detection in Containerized Environments
San Jose State University SJSU ScholarWorks Master's Projects Master's Theses and Graduate Research Spring 2018 Intrusion Detection in Containerized Environments Shyam Sundar Durairaju San Jose State University
More informationGFI EventsManager 8 ReportPack. Manual. By GFI Software Ltd.
GFI EventsManager 8 ReportPack Manual By GFI Software Ltd. http://www.gfi.com E-Mail: info@gfi.com Information in this document is subject to change without notice. Companies, names, and data used in examples
More informationGraph Based Approach for Finding Frequent Itemsets to Discover Association Rules
Graph Based Approach for Finding Frequent Itemsets to Discover Association Rules Manju Department of Computer Engg. CDL Govt. Polytechnic Education Society Nathusari Chopta, Sirsa Abstract The discovery
More informationOptimized Intrusion Detection by CACC Discretization Via Naïve Bayes and K-Means Clustering
54 Optimized Intrusion Detection by CACC Discretization Via Naïve Bayes and K-Means Clustering Vineet Richhariya, Nupur Sharma 1 Lakshmi Narain College of Technology, Bhopal, India Abstract Network Intrusion
More informationA Comparative Study of Data Mining Process Models (KDD, CRISP-DM and SEMMA)
International Journal of Innovation and Scientific Research ISSN 2351-8014 Vol. 12 No. 1 Nov. 2014, pp. 217-222 2014 Innovative Space of Scientific Research Journals http://www.ijisr.issr-journals.org/
More information6. Dicretization methods 6.1 The purpose of discretization
6. Dicretization methods 6.1 The purpose of discretization Often data are given in the form of continuous values. If their number is huge, model building for such data can be difficult. Moreover, many
More informationINTRUSION DETECTION SYSTEM USING BIG DATA FRAMEWORK
INTRUSION DETECTION SYSTEM USING BIG DATA FRAMEWORK Abinesh Kamal K. U. and Shiju Sathyadevan Amrita Center for Cyber Security Systems and Networks, Amrita School of Engineering, Amritapuri, Amrita Vishwa
More information