Classification with Diffuse or Incomplete Information


AMAURY CABALLERO, KANG YEN
Florida International University

Abstract. In many fields, such as finance, business, pattern recognition, and communication, analysts face the task of classifying items based on historical or measured data. A major difficulty is that the data to be classified are often complex, with numerous interrelated variables, or even incomplete. The time and effort required to develop a model that solves such classification problems accurately can be significant. Rough sets and fuzzy logic give the analyst a powerful tool for this task. This work analyzes the advantages of these methods from the point of view of their simplicity and the time they require.

Key Words: Fuzzy Logic, Rough Sets, Classification, Databases.

1 Introduction

When analyzing an information system or a database, problems such as attribute redundancy, missing values, and diffuse values are frequently found; they are due in general to noise, or to problems of interpretation or evaluation. Rough set theory is a very useful approach for minimizing the number of attributes necessary to represent the desired category structure, thereby eliminating redundancy. The lack of data, or of complete knowledge of the system, makes developing a model by conventional means a practically impossible task. Missing data can be produced by sensor failure, or simply by incomplete system information. Finally, diffuse values are related to noise or imprecise sensor measurements. In many applications the information is obtained from different sensors and is corrupted by noise and outliers. The present work is devoted to the analysis of these situations and the way they can be handled using rough and fuzzy sets.
The last part is devoted to solving an example that applies a simple method, using the concepts of rough and fuzzy sets, to make the classification.

2 Methodologies

2.1 Rough Sets

Rough Set Theory (RST), an extension of conventional set theory that supports approximations in decision making, was introduced by Pawlak in 1982 [1]. This theory can be used as a tool to discover data dependencies and to reduce the number of attributes contained in a data set using the data alone, requiring no additional information [2]. Pawlak defines an information system as a pair K = (U, A), where U is the universe of all objects and A is a finite set of attributes [3]. Any attribute a ∈ A is a total function a: U → V_a, where V_a is the domain of a. Let a(x) denote the value of attribute a for object x, and let [x]_A be the equivalence class containing x. We say that (x, y) is A-indiscernible for the equivalence relation I(A), where U/I(A) denotes the partition determined by the
ISBN: 978-960-6766-63-3 400 ISSN: 1790-5117

relation I(A), if (x, y) belongs to I(A) [4]. Two operations can be defined on any subset X of U. The lower approximation (positive region) is

A̲(X) = {x ∈ U : [x]_A ⊆ X},

the union of all equivalence classes [x]_A that are subsets of X. The lower approximation is the complete set of objects that can be unequivocally classified as belonging to set X. This is called the positive region, where, if A and A' are equivalence relations over U,

POS_A(U) = ∪_{X ∈ U/A'} A̲(X).

The upper approximation is

Ā(X) = {x ∈ U : [x]_A ∩ X ≠ ∅},

the union of all equivalence classes [x]_A that have a non-empty intersection with the target set. The upper approximation is the complete set of objects that are possible members of the target set X.

An important concept is that of a reduct: a minimal set of attributes that characterizes the knowledge in the information system as a whole, or in a subset of it. In general, reducts are not unique. The set of attributes common to all reducts is called the core. Rough set attribute reduction provides a tool by which knowledge is extracted from a dataset in a way that retains the information content. Several methods for attribute reduction exist in the literature. R. Jensen [2] analyzes the problem of finding a reduct of an information system: the perfect solution is to generate all possible attribute subsets and retrieve those with a maximum rough set dependency degree, but this is not practical for medium and large databases. One practical method for reducing the number of subsets examined is the QUICKREDUCT algorithm [2]. It starts with an empty set and adds, one at a time, the attribute that results in the greatest increase in the rough dependency metric, until the metric reaches its maximum possible value for the dataset.
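As an illustration of this machinery, the following minimal sketch (function names and the toy decision table are ours, not from the cited works) computes the equivalence classes of the indiscernibility relation, the dependency degree via the lower approximations of the decision classes, and a greedy QUICKREDUCT-style reduct:

```python
from collections import defaultdict

def partition(objs, attrs, val):
    """Equivalence classes [x]_B of the indiscernibility relation I(B):
    objects with identical values on every attribute in attrs."""
    cls = defaultdict(set)
    for x in objs:
        cls[tuple(val(x, a) for a in attrs)].add(x)
    return list(cls.values())

def dependency(objs, attrs, val, decision):
    """gamma_B = |POS_B| / |U|: fraction of objects whose B-class lies
    wholly inside one decision class (i.e. in a lower approximation)."""
    dec_classes = partition(objs, [decision], val)
    pos = set()
    for c in partition(objs, attrs, val):
        if any(c <= d for d in dec_classes):
            pos |= c
    return len(pos) / len(objs)

def quickreduct(objs, attrs, val, decision):
    """Greedy forward selection: repeatedly add the attribute giving the
    largest rise in dependency until the full-set dependency is reached."""
    full = dependency(objs, attrs, val, decision)
    red, best = [], 0.0
    while best < full:
        a = max((x for x in attrs if x not in red),
                key=lambda x: dependency(objs, red + [x], val, decision))
        red.append(a)
        best = dependency(objs, red, val, decision)
    return red

# Hypothetical toy table: a2 alone determines the decision d
table = {1: {"a1": 0, "a2": 0, "d": 0}, 2: {"a1": 1, "a2": 0, "d": 0},
         3: {"a1": 0, "a2": 1, "d": 1}, 4: {"a1": 1, "a2": 1, "d": 1}}
val = lambda x, a: table[x][a]
print(quickreduct(table, ["a1", "a2"], val, "d"))  # ['a2']
```

On this table the algorithm finds the single-attribute reduct {a2}, since a2 by itself yields dependency 1 while a1 yields 0.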
For very large datasets, one criterion for stopping the search could be to terminate when there is no further increase in the dependency.

Another method uses the discernibility matrix approach. The discernibility matrix of a decision table D = (U, A ∪ {d}) is a symmetric |U| × |U| matrix whose entries are

d_ij = {a ∈ A : a(x_i) ≠ a(x_j)},   i, j = 1, ..., |U|.

Each entry d_ij contains the attributes on which objects i and j differ. The function for minimizing the attributes using the discernibility matrix is a Boolean function whose terms are

f_i(a_1, ..., a_m) = ∧ {∨ d_ij : d_ij ≠ ∅}.

By finding the set of all prime implicants of the discernibility function, all the minimal reducts of the system may be determined.

2.2 Fuzzy Sets

Fuzzy logic is a multi-valued logic that helps describe concepts commonly encountered in the real world using linguistic variables. One of its basic concepts is the membership function; in general, any function A: X → [0, 1] can be used as a membership function describing a fuzzy set. When designing a fuzzy system the expert faces, besides the selection of the input and output variables and their universes of discourse, the problem of optimizing the number and characteristics of the membership

functions, as well as the rules that control the process.

One important concept of fuzzy logic applied in this work is the Compatibility Index (CI), presented by Cox [5]: a method for calculating the similarity of a situation to previously imposed conditions using fuzzy sets. The index is the average of the degrees of membership (μ) for each object:

CI_i = (Σ_{k=1}^{m} μ_k) / m,

where i = 1, ..., n is the object number and k = 1, ..., m is the attribute number.

3 Methods for Classification of Diffuse or Incomplete Information

The theory of rough sets is used for the analysis of categorical data. The number of values that the attributes characterizing the objects of interest can take may be very large, which can lead to the creation of too many rules. As expressed by Y. Leung [6], such rules may be accurate with reference to the training data set, but their generalization ability will most likely be rather low, since a perfect match of real-valued condition attributes is generally difficult if not impossible. To make the identified classification rules more comprehensible and practical, a preprocessing step that transforms the real-valued attributes into a sufficiently small number of meaningful intervals is necessary. This approach is known as granular computing. As indicated by Y. Y. Yao [7, 8], when a problem involves incomplete, uncertain, or vague information, it may be difficult to differentiate distinct elements, and one is forced to consider granules. This approach leads to the simplification of practical problems. R. Jensen [2] presents a fuzzy-rough feature selection method that handles noisy data by performing discretization before dealing with the data set; several algorithms and examples are presented in [2]. A generalization of rough set models based on a fuzzy lower approximation is presented by Wang [4].
The concept of a tolerant lower approximation is introduced there for dealing with noisy data.

When dealing with incomplete information, the typical approach is to use fuzzy logic, deriving inference rules from training examples using the available information and assuming that the characteristics of the missing part are similar to those already obtained. The literature describes many methods for deriving membership functions and fuzzy rules from training examples [9, 10]; among them are the Batch Least Squares algorithm, the Recursive Least Squares algorithm, the Gradient Method, and the Learning from Examples method. E. A. Rady et al. [11] introduced a modified similarity relation for dealing with incomplete information systems, which depends on the number of missing values relative to the total number of attributes defined for each object. Neural networks and genetic algorithms have also been used for classification with incomplete data. E. Granger et al. [12] analyze the situations of a limited number of training cases, missing components of the input patterns, missing class labels during training, and missing classes, using the fuzzy ARTMAP neural network.

4 Practical Example for Diffuse Information

Noise or imprecision may cause the information from sensors to be diffuse. One example of solving classification problems using fuzzy sets and rough set theory is presented by T. Ying-Chieh [13], who uses a rough classification approach and an entropy-minimization algorithm to solve the iris classification problem. In Table 1 of [13] the attributes are SL (sepal length), SW (sepal width), PL (petal length), and PW (petal width). These attributes are used in our example, for which the mean, standard deviation, minimum, and maximum of each class have been calculated; the results are presented in Table 1.

Another approach is recommended by Leung Yee [6], who starts by defining several concepts, partially reproduced here with small changes. This approach is applied in the present paper to the iris classification problem.

Misclassification Rates. The probability that objects in class u_i are misclassified into class u_j according to attribute a_k is

α_ij^k = 0, if [l_i^k, u_i^k] ∩ [l_j^k, u_j^k] = ∅;
α_ij^k = min{u_i^k − l_j^k, u_j^k − l_i^k} / (u_i^k − l_i^k), otherwise,

where l_i^k and u_i^k are, respectively, the minimum and maximum values for class u_i on attribute k, and l_j^k and u_j^k are the corresponding values for class u_j.

Maximal Mutual Classification Error Between Classes:

β_ij^k = max{α_ij^k, α_ji^k}.

Permissible Misclassification Rate Between Classes u_i and u_j:

β_ij = min{β_ij^k : 1 ≤ k ≤ m}.

Defining a parameter α (0 ≤ α ≤ 1) as the specified admissible classification error, if β_ij ≤ α there must exist an attribute a_k such that, by using a_k, the two classes u_i and u_j can be separated within the permissible misclassification rate α [6].
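These definitions can be checked numerically against the class minima and maxima of Table 1; the sketch below (function names are ours, and the interval-overlap formula is our reading of the garbled original) reproduces, for instance, β_23^3 ≈ 0.29 for petal length and β_12 = 0:

```python
def alpha(i, j, k, lo, hi):
    """Rate of misclassifying class i into class j on attribute k,
    computed from the class intervals [lo[i][k], hi[i][k]]."""
    li, ui = lo[i][k], hi[i][k]
    lj, uj = lo[j][k], hi[j][k]
    if ui < lj or uj < li:          # disjoint intervals: no confusion
        return 0.0
    return min(ui - lj, uj - li) / (ui - li)

# Min/Max per class from Table 1; attribute order: SL, SW, PL, PW
lo = {1: [4.3, 2.3, 1.0, 0.1], 2: [4.9, 2.0, 3.0, 1.0], 3: [4.9, 2.2, 4.5, 1.4]}
hi = {1: [5.8, 4.4, 1.9, 0.6], 2: [7.0, 3.4, 5.1, 1.8], 3: [7.9, 3.8, 6.9, 2.5]}

def beta(i, j, k):
    """Maximal mutual classification error on attribute k."""
    return max(alpha(i, j, k, lo, hi), alpha(j, i, k, lo, hi))

print(round(beta(2, 3, 2), 2))               # PL (k=2): 0.29, as in the text
print(min(beta(1, 2, k) for k in range(4)))  # beta_12 = 0.0 -> separable
```

Since β_12 = 0 ≤ α for any admissible α, Setosa and Versicolor are separable by some attribute, in agreement with the tolerance matrix derived below in the text.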
α-tolerance Relation Matrix for a given β ij, and an attribute subset B A, R B α = {(u i, u j ) U x U: β ij k α a k B The error that objects in class u i being misclassified into class u j in the system is defined as α ij = min {α ij k : k m} and are given in Table 2. Table 2.Error of Misclassification of Object u i into u j αij u 1 u 2 u 3 u 1 1 0 0 u 2 0 1 0 u 3 0.25 0.29 1 The maximal mutual classification error between classes is defined by β k ij = max { α k ij, α k ji }, and in the present example is given by: β 1 12 = 0.6 1 β 13 = 0.6 β 1 23 = 1 β 2 12 = 0.78 β 2 13 2 = 0.94 β 23 = 0.86 β 3 12 = 0 β 3 13 = 0 β 3 23 = 0.29 β 4 12 = 0 β 4 13 = 0 β 4 23 = 0.5 Selecting α = 0.2, the permissible misclassification rate for the present example is shown on Table 3. Table 3. Permissible Misclassification Rate Between Classes u i and u j β ij u 1 u 2 u 3 u 1 1 0 0 u 2 0 1 0.29 u 3 0 0.29 1 ISBN: 978-960-6766-63-3 403 ISSN: 1790-5117

The matrix of the α-tolerance relation, in which every pair of classes with β_ij > α is marked with a 1, is

            1 0 0
R_A^0.2 =   0 1 1
            0 1 1

From the matrix it is clear that object u_1 (Setosa) can be uniquely defined from the given attributes, but objects u_2 (Versicolor) and u_3 (Virginica) may not be separated. This situation can be expressed as

S_A^0.2(u_1) = {u_1},   S_A^0.2(u_2) = S_A^0.2(u_3) = {u_2, u_3},

where S_A^0.2(u) denotes the set of objects possibly indiscernible from u by A within the misclassification rate α = 0.2. From the previous results, the 0.2-discernibility sets are given in Table 4.

Table 4. Discernibility Set

       u_1         u_2         u_3
u_1
u_2    a_3, a_4
u_3    a_3, a_4

The obtained functions are

f_1^0.2 = a_3 ∨ a_4,   f_2^0.2 = a_3 ∨ a_4.

The following rules can be extracted:

R(u_1): IF a_3 ∈ [1, 1.9] OR a_4 ∈ [0.1, 0.6] THEN it is u_1 (Setosa).
R(u_2): IF a_3 ∈ [3, 5.1] OR a_4 ∈ [1, 1.8] THEN it can be u_2 (Versicolor) or u_3 (Virginica).

Rule 1 is clear. To select between u_2 and u_3, one possibility is to use fuzzy logic. The authors propose the following procedure:

1) For each object u_i in the fired rules, find the degree of membership with the imposed conditions for each participating attribute a_k.
2) Find the compatibility index (CI) for each object.
3) Compare the compatibility indexes and select the object with the greater one.

For the example, assume for each interval a triangular membership function whose maximum coincides with the mean value of the interval, with a domain running from the minimum to the maximum of the interval. Taking as inputs SL (sepal length) = 5.6, SW (sepal width) = 3.0, PL (petal length) = 4.5, and PW (petal width) = 1.5, the computed compatibility indexes are 0.67 for Versicolor and 0.3 for Virginica; the selection is therefore clearly Versicolor. A test was made with all the objects in the table used in the example [13].
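The selection step can be sketched as follows, assuming triangular membership functions that peak at the class mean with support from the class minimum to maximum (names are ours). This reproduces the 0.67 reported for Versicolor; our Virginica value comes out near 0.37 rather than 0.3, which suggests a slightly different rounding or membership choice in the original computation, but the ordering (and hence the decision) is the same:

```python
def tri(x, lo_v, peak, hi_v):
    """Triangular membership: 0 at the interval ends, 1 at the peak."""
    if x <= lo_v or x >= hi_v:
        return 1.0 if x == peak else 0.0
    return (x - lo_v) / (peak - lo_v) if x <= peak else (hi_v - x) / (hi_v - peak)

# (min, mean, max) per attribute from Table 1; order: SL, SW, PL, PW
versicolor = [(4.9, 5.94, 7.0), (2.0, 2.77, 3.4), (3.0, 4.26, 5.1), (1.0, 1.33, 1.8)]
virginica  = [(4.9, 6.59, 7.9), (2.2, 2.91, 3.8), (4.5, 5.55, 6.9), (1.4, 2.03, 2.5)]
x = [5.6, 3.0, 4.5, 1.5]            # SL, SW, PL, PW of the sample

def ci(sample, cls):
    """Compatibility Index: mean membership over the attributes."""
    mu = [tri(v, lo_v, m, hi_v) for v, (lo_v, m, hi_v) in zip(sample, cls)]
    return sum(mu) / len(mu)

print(round(ci(x, versicolor), 2), round(ci(x, virginica), 2))  # 0.67 0.37
```

Since CI is larger for Versicolor, the sample is assigned to Versicolor, as in the text.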
In this case there are 50 Virginica and 50 Versicolor examples. The algorithm fails on one Versicolor and one Virginica, which gives an average classification rate of 98% for the analyzed table.

5 Conclusions

The combination of rough sets and fuzzy logic is a powerful tool for classifying objects in a database. Attribute minimization using rough set theory is extremely useful when dealing with big databases, but when the number of possible values for the attributes is too big, or when dealing with diffuse or missing data, the use of fuzzy logic and the selection of interval values is mandatory. Several methods have been presented for classification with imprecise or missing information. It is not possible to affirm that one method is always better than another; in any case, there is always some uncertainty regarding the final result.

This uncertainty is related to the accepted admissible classification error (α) in the case of imprecise information, and to the characteristics and quantity of the missing information in the case of incomplete information. The solution of the iris classification problem using a method that differs from the one originally presented in the literature shows a different approach for obtaining equivalent results. The solution in the present work is simpler than the entropy-based approach because fewer rules are used.

References

[1] Pawlak, Z. Rough Sets. International Journal of Computer and Information Sciences, No. 11, 1982, pp. 341-356.
[2] Jensen, R. Combining Rough and Fuzzy Sets for Feature Selection. Ph.D. Dissertation, University of Edinburgh, 2005.
[3] Pawlak, Z. Rough Sets: Theoretical Aspects of Reasoning About Data. Dordrecht: Kluwer, 1991.
[4] Wang, Feng-Hsu. On Acquiring Classification Knowledge from Noisy Data Based on Rough Set. Expert Systems with Applications, No. 29, 2005, pp. 49-64.
[5] Cox, E. D. Fuzzy Logic for Business and Industry. Rockland: Charles River Media, 1995.
[6] Leung, Y. et al. A Rough Set Approach for the Discovery of Classification Rules in Interval-valued Information Systems. International Journal of Approximate Reasoning, No. 47, 2007, pp. 233-246.
[7] Yao, Y. Y. Information Granulation and Rough Set Approximation. International Journal of Intelligent Systems, Vol. 16, No. 1, 2001, pp. 87-104.
[8] Yao, Y. Y. Information Granulation and Approximation in a Decision Theoretic Model of Rough Sets. In: Rough-Neuro Computing with Words, edited by L. Polkowski et al. Springer, Berlin, 2003, pp. 491-518.
[9] Ross, J. T. Fuzzy Logic with Engineering Applications. John Wiley and Sons, Inc., 2004.
[10] Passino, K., S. Yurkovich. Fuzzy Control. Addison Wesley Longman, Menlo Park, California.
[11] Rady, E. A. et al.
A Modified Rough Set Approach to Incomplete Information Systems. Journal of Applied Mathematics and Decision Sciences, 2007.
[12] Granger, E. et al. Classification of Incomplete Data Using the Fuzzy ARTMAP Neural Network. Technical Report, Boston University Center for Adaptive Systems and Department of Cognitive and Neural Systems, January 2000.
[13] Ying-Chieh, T. et al. Entropy-Based Fuzzy Rough Classification Approach for Extracting Classification Rules. Expert Systems with Applications, No. 31, 2006, pp. 436-443.

Table 1. Attributes for Each Class

            Setosa                  Versicolor              Virginica
Attribute   x_av   σ     Min  Max   x_av   σ     Min  Max   x_av   σ     Min  Max
SL          5.0    0.35  4.3  5.8   5.94   0.52  4.9  7.0   6.59   0.64  4.9  7.9
SW          3.42   0.38  2.3  4.4   2.77   0.31  2.0  3.4   2.91   0.32  2.2  3.8
PL          1.45   0.11  1.0  1.9   4.26   0.47  3.0  5.1   5.55   0.55  4.5  6.9
PW          0.24   0.11  0.1  0.6   1.33   0.20  1.0  1.8   2.03   0.27  1.4  2.5