Qualitative classification and evaluation in possibilistic decision trees


Qualitative classification and evaluation in possibilistic decision trees

Nahla Ben Amor
Institut Supérieur de Gestion de Tunis, 41 Avenue de la liberté, 2000 Le Bardo, Tunis, Tunisia
E-mail: nahla.benamor@gmx.fr

Salem Benferhat
CRIL - CNRS, Université d'Artois, Rue Jean Souvraz SP 18, 62307 Lens Cedex, France
E-mail: benferhat@cril.univ-artois.fr

Zied Elouedi
Institut Supérieur de Gestion de Tunis, 41 Avenue de la liberté, 2000 Le Bardo, Tunis, Tunisia
E-mail: zied.elouedi@gmx.fr

Abstract - This paper presents a method for classifying objects in an uncertain context using decision trees. The uncertainty, which bears on the attribute values of the objects to classify, is handled in a qualitative possibilistic framework. An evaluation method to judge the classification efficiency in an uncertain context is then proposed.

I. INTRODUCTION

Decision trees are efficient methods used in classification problems. They consist of decision nodes for testing attributes, edges for branching on attribute values, and leaves for labeling classes [9], [7]. The decision tree technique is composed of two major procedures [2], [11]:
1) Building the tree: a decision tree is built from a given training set. Building consists in finding, for each decision node, the appropriate test attribute by means of an attribute selection measure, and in defining the class labeling each leaf satisfying one of the stopping criteria.
2) Classifying objects: we start at the root of the decision tree and test the attribute specified by this node. According to the result of the test, we move down the tree branch relative to the attribute value of the given object. This process is repeated until a leaf is encountered; this leaf is labeled by a class.
As pointed out in several works [1], [4], [5], [6], [10], the classical methods of induction of decision trees do not deal with uncertain data, and ignoring the uncertainty can affect the quality of the classification results. In order to adapt decision trees to uncertainty and imprecision, we first propose
different manners to classify objects with uncertain or missing attribute values using qualitative possibility theory. Then, we propose a criterion allowing to judge the efficiency of the classifier in an uncertain context. We illustrate our approach with the same running example throughout, drawn from the intrusion detection area.

The paper is organized as follows: Section 2 presents an overview of possibility theory. Section 3 recalls the basics of possibilistic decision trees. In Section 4, we describe our leximin/leximax classification in possibilistic decision trees. In Section 5, the evaluation of the classification efficiency of possibilistic decision trees is detailed.

II. POSSIBILITY THEORY

This section gives a brief recalling of possibility theory (for more details see [3]). Uncertainty is here assumed to be represented qualitatively by a finite totally ordered scale; in practice the degrees of this scale can be encoded in the unit interval [0, 1], and for any set of uncertainty degrees the operators max and min return its greatest and least element, respectively.

The basic concept of possibility theory, when uncertainty is represented qualitatively, is the notion of Qualitative Possibility Distribution (QPD), simply denoted by π. A QPD is a function which associates to each element ω of the universe of discourse Ω an element of the scale; π encodes our beliefs on the real world. By convention, π(ω) = 1 means that it is completely possible that ω is the real world, π(ω) = 0 means that ω cannot be the real world, and π(ω) ≥ π(ω′) means that ω is at least as possible as ω′ to be the real world. A QPD is said to be normalized if there exists at least one state which is totally possible (i.e. ∃ω, π(ω) = 1).

We define the possibility measure of any event A ⊆ Ω by:

Π(A) = max {π(ω) : ω ∈ A}    (1)

This measure evaluates at which level the event A is consistent with our knowledge represented by π.

III. POSSIBILISTIC DECISION TREES

In this section, we do not detail the construction of decision trees, which is based on a given training set where attribute values and classes are defined precisely (for more details see [11]). We are rather interested in how to classify objects characterized by uncertain attribute values, where the uncertainty is represented by qualitative possibility distributions.
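As a small illustration of the possibility-theoretic notions recalled above, the following sketch encodes a QPD as a Python dictionary mapping states to degrees; the states and degrees used here are invented for illustration.

```python
# A qualitative possibility distribution over a small universe of discourse;
# the states and degrees below are invented for illustration.
pi = {"http": 1.0, "private": 0.5, "domain_u": 0.0}

def is_normalized(pi):
    """A QPD is normalized if at least one state is totally possible."""
    return max(pi.values()) == 1.0

def possibility(pi, event):
    """Possibility measure (Equation 1): Pi(A) = max of pi over states of A."""
    return max((pi[w] for w in event), default=0.0)

assert is_normalized(pi)
assert possibility(pi, {"private", "domain_u"}) == 0.5
```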
We assign to each attribute a possibility distribution expressing the uncertainty in a qualitative way, encoded in the interval [0, 1]. Let A_1, ..., A_m be the different attributes of the problem; the instance to classify is described by a vector of possibility distributions (π_1, ..., π_m). An attribute is precisely

defined if there exists exactly one value with possibility degree 1 while all other values have degree 0. A missing value for an attribute is represented by a uniform possibility distribution (i.e. every value of the attribute gets degree 1).

In standard possibility theory, the basic operators min and max are used in order to choose the most plausible path in the tree. At first, we compute the possibility degree of each path (from the root to a leaf class) by applying the minimum operator to the degrees of its attribute values. Then, the most plausible path is the one presenting the highest possibility degree; in other words, we apply the maximum operator to the paths' degrees. Hence the class of the object to classify is the one labeling the leaf corresponding to this path.

Example 1: In order to illustrate the different notions presented in this paper, we consider an example in the intrusion detection field, where we handle formatted connections corresponding to TCP/IP dump rows. Note that, for the sake of simplicity, each connection is described by only four attributes: service, flag, count and wrong fragment. We also handle three classes: Normal (N), DOS (D) and Probing (P), where Normal corresponds to a normal connection while DOS and Probing are relative to two categories of attacks.

[Figure 1: Example of a decision tree in the intrusion detection field. The root tests service (http, domain_u, private); inner nodes test flag (SF, REJ, RSTO), count and wrong fragment; the leaves are labeled N, P and D.]

Assume that the connection to classify is given with the possibility distributions of Table I.

[Table I: possibility distributions on the attribute values of the connection to classify, covering the values http, domain_u and private of service, and SF, REJ and RSTO of flag.]

According to the decision tree (see Figure 1), we have nine paths. Applying the minimum operator to the degrees relative to each path, and then the maximum operator to the resulting paths' degrees, the most plausible paths are paths 3 and 9; thus the connection will be classified as Probing or DOS with a possibility degree 1. It is clear that the use of the maximum operator makes it difficult to choose between the equally plausible paths.
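The min/max classification described above can be sketched as follows; the toy tree, attribute values and degrees are invented stand-ins for Figure 1 and Table I, and a missing attribute is encoded as a uniform distribution.

```python
# Min/max classification sketch: each path is a list of (attribute, value)
# tests plus a leaf class. Tree and degrees are invented for illustration.
paths = [
    ([("service", "http"), ("count", "low")], "Normal"),
    ([("service", "http"), ("count", "high")], "Probing"),
    ([("service", "private"), ("flag", "SF")], "DOS"),
]

# One possibility distribution per attribute of the object to classify;
# a missing attribute (here flag) is a uniform distribution (all degrees 1).
obj = {
    "service": {"http": 1.0, "private": 1.0},
    "count":   {"low": 0.3, "high": 1.0},
    "flag":    {"SF": 1.0, "REJ": 1.0},
}

def path_degree(tests, obj):
    # Possibility degree of a path = minimum of its attribute-value degrees.
    return min(obj[attr].get(val, 0.0) for attr, val in tests)

best = max(path_degree(tests, obj) for tests, _ in paths)
candidates = {cls for tests, cls in paths if path_degree(tests, obj) == best}
# As in Example 1, several equally plausible classes may remain in candidates.
```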
IV. LEXIMIN/LEXIMAX CLASSIFICATION IN POSSIBILISTIC DECISION TREES

The min-max combination mode is not satisfactory, since it is somewhat cautious: it makes the number of candidate classes important, especially when the number of missing attributes is important. Furthermore, the min and max operators are not discriminatory. Indeed, one can check that, for any attribute, replacing any degree strictly between 0 and 1 by another such degree does not change the selected candidate classes. This is explained by the fact that if the distributions are normalized, then there exists at least one path from the root to a leaf class such that the possibility degree of each node's value in this path is equal to 1. Hence, with the min/max combination mode, only the paths where the possibility degrees of the attribute values are equal to 1 are considered.

One idea to overcome the drawbacks of the min/max combination is to extend these two operators by using the leximin and leximax criteria, which are natural extensions of the minimum and maximum operators used in the qualitative

setting [8]. They are defined as follows.

Definition 1: Let u and v be two vectors of degrees of the same length n, and let u_(1) ≤ ... ≤ u_(n) and v_(1) ≤ ... ≤ v_(n) be their reorderings in increasing order (respectively u^(1) ≥ ... ≥ u^(n) and v^(1) ≥ ... ≥ v^(n) in decreasing order). Then u is said to be leximin-preferred (resp. leximax-preferred) to v if and only if there exists i such that u_(j) = v_(j) for all j < i and u_(i) > v_(i) (resp. u^(j) = v^(j) for all j < i and u^(i) > v^(i)). u is said to be leximin-equal (resp. leximax-equal) to v if and only if u_(i) = v_(i) (resp. u^(i) = v^(i)) for all i.

Let P be the set of all the different paths from the root to the leaves. For each class C, we associate a vector containing the degrees of the paths having C as a leaf, ranked in leximin order. To apply this criterion, all paths should be described by the same attributes already defined in the training set. However, since paths are pruned, the idea is to assign the degree 1 to the missing values. The justification of adding the degree 1 is the following: if in some path a class C is obtained without an attribute A, this in fact means that C can be obtained independently of the value of A. In other terms, C can be obtained from a path composed with the most plausible instance of A (namely a value having degree 1, since only normalized distributions are considered).

Definition 2: Let u and v be two vectors relative to the paths leading to two classes, reordered as in Definition 1. u is said to be leximin-leximax preferred to v, denoted u ≻ v, if either there exists i such that the first i − 1 reordered components of u and v are equal and the i-th component of u is greater, or all common components are equal and u is longer than v (i.e. the class of u is supported by a greater number of paths than the class of v). u is said to be leximin-leximax equal to v if and only if the reordered vectors are componentwise equal.

Definition 3: Let C be a set of classes. A class C* of C is leximin-leximax preferred iff there is no class C in C such that C ≻ C*.

The selection mode based on the leximin/leximax operators proceeds in two steps:
1) Establish a total pre-order of all the paths using the leximin operator; then select a first set of candidate classes corresponding to the leaves of the best paths in this total pre-order.
2) Refine this set by selecting its leximax-preferred class(es) using Definition 3.

Example 2: Let us continue the previous example. According to the leximin criterion, we get a total pre-order between the different paths, and paths 3 and 9 are the best.
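Sorting-based keys give a direct way to implement the leximin and leximax comparisons of Definition 1: sort each vector (increasingly for leximin, decreasingly for leximax) and compare the sorted tuples lexicographically. The vectors below are invented for illustration.

```python
# Leximin/leximax comparison sketch (Definition 1); example vectors invented.

def leximin_key(v):
    # Larger key = leximin-preferred: sort increasingly, compare pointwise.
    return tuple(sorted(v))

def leximax_key(v):
    # Larger key = leximax-preferred: sort decreasingly, compare pointwise.
    return tuple(sorted(v, reverse=True))

u = [1.0, 0.3, 1.0]
v = [0.3, 0.3, 1.0]
assert leximin_key(u) > leximin_key(v)   # u is leximin-preferred to v
assert leximax_key(u) > leximax_key(v)   # u is leximax-preferred to v
```

Note that, unlike plain min/max, these keys discriminate between vectors having the same minimum (or maximum), which is exactly the drawback of the min/max mode discussed above.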
Probing and DOS are the classes labeling paths 3 and 9, respectively; in other terms, the connection will be classified as a Probing or a DOS attack. Then, the leximax refinement prefers Probing to DOS; thus, it is possible to obtain a more precise result: the connection will be classified as a Probing attack.

V. EVALUATION OF POSSIBILISTIC DECISION TREES

When dealing with an uncertain context, the evaluation of a classifier, namely a possibilistic decision tree, is not so obvious.

A. Percent of Correct Classification

In the classical case, the Percent of Correct Classification (PCC) corresponds to the proportion of well classified objects among the whole set of objects. However, since within a possibilistic decision tree a new object may not be classified in a unique class, it is necessary to adapt the PCC to the uncertainty pervading the classes. Thus, the idea is to choose for each object to classify the class having the highest possibility degree; if more than one class is obtained, then one of them is chosen randomly. The obtained class is considered as the class of the testing object. Hence, the PCC relative to the whole testing set is computed by comparing, for each testing instance, its real class (known by us) with the class obtained by the induced tree:

PCC = number of well classified objects / number of testing objects

where the number of well classified objects is the number of testing objects for which the class obtained by the possibilistic decision tree (the most plausible class) is the same as their real class.

B. Distance criterion

The limitation of the adapted PCC is that it ignores the order that exists between the different classes that may correspond to the chosen class; it only considers the most plausible class. So, we propose a criterion allowing to take into account the order of the classes characterizing the object to classify. More exactly, we propose to compare the ranking assigned to the classes with the real class of the given testing object. Such a comparison is based on a kind of distance.
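The adapted PCC of Section V-A can be sketched as follows; the candidate sets and real labels are invented, and ties between equally plausible classes are broken at random as described above.

```python
import random

# Adapted PCC sketch: one set of most-plausible candidate classes per testing
# object, compared with the real class; ties are broken at random.
def adapted_pcc(predictions, real_classes, rng=random.Random(0)):
    well_classified = sum(
        1 for candidates, real in zip(predictions, real_classes)
        if rng.choice(sorted(candidates)) == real
    )
    return well_classified / len(real_classes)

# With singleton candidate sets this reduces to the classical PCC.
assert adapted_pcc([{"D"}, {"P"}, {"N"}], ["D", "P", "D"]) == 2 / 3
```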
At first, we define a qualitative possibility distribution π_o assigned to the object o, as follows.

Assume we handle n classes C_1, ..., C_n. Then the degree π_o(C_i) is determined by the position of C_i in the decreasing ranking of the classes produced for the object o (the better the rank of C_i, the higher its degree), and π_o(C_i) = 0 if C_i does not appear in this ranking.

Next, we define the distance criterion for a testing object o (whose possibility distribution is π_o) with respect to its real class, by comparing π_o with the distribution δ_o defined by δ_o(C_i) = 1 if C_i is the real class of o and δ_o(C_i) = 0 otherwise. For instance, assume that the real class of the object to classify is the attack DOS (D); then δ_o gives the degree 1 to D and 0 to N and P, and the distance is computed between π_o and δ_o.

This distance verifies the following property: when it is close to 2, the classifier is bad, whereas when it falls to 0, it is considered as a good classifier. In order to give this distance a signification closer to the PCC, we propose to rescale it (the rescaled version will be denoted dPCC) so that it satisfies 0 ≤ dPCC ≤ 1, higher values corresponding to better classifiers. Next, we compute the average total distance relative to all the classified testing instances:

dPCC_total = (sum of dPCC over the classified objects) / (number of classified objects)

Thus, dPCC_total can be considered as a PCC calibrated on the whole testing set.

C. Example

Let us continue with our example, where we deal with the three classes N, D and P. To classify the connection given in Example 2, we get (according to the induced tree) a ranking of the classes with Probing in first position; the corresponding possibility distribution π_o is the one induced by this ranking. So, we get a dPCC of 39%: there is a 39% chance that the induced tree detects the real class of the connection, whereas applying the classical PCC directly to the most plausible class leads to an erroneous result. Obviously, we can apply this distance criterion to the whole testing set using the average total distance above.

VI. CONCLUSION

In this paper, we have presented two contributions. The first one concerns the classification, using decision trees, of objects characterized by uncertain attribute values, where uncertainty is represented in a qualitative possibilistic framework. Indeed, to overcome the limitations of the standard min-max combination, we have proposed a leximin/leximax combination mode in the classification
phase. In the second part, we have proposed a new criterion to judge the efficiency of classifiers in an uncertain context, namely for qualitative possibilistic decision trees. This criterion takes into account the total pre-order of the classes relative to each testing instance, and not only the best one as in the classical Percent of Correct Classification. A future work will be to introduce a semantic distance into this criterion, allowing to adjust the degree of similarity between classes.

REFERENCES

[1] Ben Amor N., Benferhat S., Elouedi Z., Mellouli K.: Decision Trees and Qualitative Possibilistic Inference: Application to the Intrusion Detection Problem. Proceedings of the European Conference on Symbolic and Quantitative Approaches to Reasoning with Uncertainty (ECSQARU 2003), 419-431, 2003.
[2] Breiman L., Friedman J. H., Olshen R. A., Stone C. J.: Classification and Regression Trees. Monterey, CA, Wadsworth & Brooks, 1984.
[3] Dubois D., Prade H.: Possibility Theory: An Approach to Computerized Processing of Uncertainty. Plenum Press, New York, 1988.
[4] Denoeux T., Skarstein-Bjanger M.: Induction of decision trees for partially classified data. Proceedings of the IEEE International Conference on Systems, Man and Cybernetics, Nashville, USA, 2923-2928, 2000.
[5] Elouedi Z., Mellouli K., Smets P.: Belief decision trees: theoretical foundations. International Journal of Approximate Reasoning 28, 91-124, 2001.
[6] Hüllermeier E.: Possibilistic induction in decision-tree learning. ECML 2002, 2002.
[7] Mitchell T. M.: Decision tree learning. Chapter 3 of Machine Learning, co-published by the MIT Press and the McGraw-Hill Companies, Inc., 1997.

[8] Moulin H.: Axioms for Cooperative Decision-Making. Cambridge University Press, 1988.
[9] Quinlan J. R.: Induction of decision trees. Machine Learning 1, 81-106, 1986.
[10] Quinlan J. R.: Probabilistic decision trees. Machine Learning, Vol. 3, Chap. 5, Morgan Kaufmann, 267-301, 1990.
[11] Quinlan J. R.: C4.5: Programs for Machine Learning. Morgan Kaufmann, San Mateo, CA, 1993.