Feature Selection Based on Relative Attribute Dependency: An Experimental Study

Jianchao Han, Ricardo Sanchez, Xiaohua Hu, T.Y. Lin

Department of Computer Science, California State University Dominguez Hills, 1000 E. Victoria Street, Carson, CA
College of Information Science and Technology, Drexel University, 3141 Chestnut Street, Philadelphia, PA
Department of Electrical Engineering and Computer Science, University of California, Berkeley, California

Glossary

Rough set: A rough set is defined by the lower and upper approximations of a concept. The lower approximation contains all elements that necessarily belong to the concept, while the upper approximation contains those that possibly belong to it. In rough set theory, a concept is treated as a classical set.

Reduct: The task of rough set attribute reduction is to find a subset of the condition attributes set that functions as the original condition attributes set without loss of classification capability. Such a subset of the condition attributes set is called a reduct.

Attribute dependency: The degree of attribute dependency measures how much one attribute subset depends on another attribute subset.

Relative attribute dependency: The degree of relative attribute dependency can be calculated by counting the distinct instances of a projection of the data set, instead of generating discernibility functions or positive regions.

Abstract: Most existing rough set-based feature selection algorithms suffer from intensive computation of either discernibility functions or positive regions to find attribute reducts. In this paper, we develop a new computation model based on relative attribute dependency, defined as the proportion of the projection of the decision table on a condition attributes subset to the projection of the decision table on the union of that subset and the decision attributes set. To find an optimal reduct, we use the information entropy conveyed by the attributes as the heuristic. A novel algorithm that finds optimal reducts of condition attributes based on relative attribute dependency is implemented in Java and evaluated on 10 data sets from the UCI Machine Learning Repository. We compare data classification using C4.5 on the original data sets and on their reducts. The experimental results demonstrate the usefulness of our algorithm.

1 Introduction

Many factors affect the performance of data analysis, and one prominent factor is the size of the data set. In the information era, the huge amounts of computerized data that many organizations possess about their business and/or scientific research attract researchers from communities such as statistics, bioinformatics, databases, machine learning, and data mining. Most data sets collected from real-world applications contain noisy data, which may distract the analyst and lead to nonsensical conclusions. The original data therefore need to be cleaned, both to reduce the size of the data set and to remove noise. This cleaning is usually done by data reduction.

Feature selection has long been an active research topic in statistics, pattern recognition, machine learning, and data mining. Most researchers have focused on designing new methods and improving the performance of their algorithms. These methods can be divided into two types: exhaustive and heuristic search. Exhaustive search probes all possible subsets of the original features, which is prohibitive when the number of original features is large. In practice, heuristic search avoids this exponential computation and in general uses background information to approximately estimate the relevance of features. Although heuristic search works reasonably well, some features with high-order correlation may be missed.

Rough set theory has been used to develop feature selection algorithms by finding condition attribute reducts. Most existing rough set-based feature selection algorithms suffer from intensive computation of either discernibility functions or positive regions to find attribute reducts. In order to improve efficiency, in this paper we develop a new computation model based on relative attribute dependency. With this model, a novel algorithm to find optimal reducts of condition attributes based on relative attribute dependency is proposed and implemented. The implemented algorithm is evaluated on 10 data sets from the UCI Machine Learning Repository. The experimental results demonstrate its usefulness and are analyzed for further research.

2 Rough Set Approach

Rough set theory was developed by Pawlak [13] in the early 1980s and has been used in data analysis, pattern recognition, data mining, and knowledge discovery [8, 14]. Recently, rough set theory has also been employed to select feature subsets [4, 10, 11, 15, 17]. In the rough set community, feature selection algorithms are attribute-reduct oriented, that is, they find an optimal reduct of the condition attributes of a given data set. Two main approaches to finding attribute reducts are recognized: discernibility function-based and attribute dependency-based [3, 11]. These algorithms, however, suffer from intensive computations of either discernibility functions for the former or positive regions for the latter, although some efficiency improvements have been made in recent developments.

In rough set theory, the data is collected in a table, called a decision table. Rows of the decision table correspond to instances, and columns correspond to features (or attributes). All attributes are divided into two groups: the condition attributes set C as input and the decision attributes set D as output. Assume P ⊆ C ∪ D and Q ⊆ C ∪ D. The positive region of Q with respect to P, denoted POS_P(Q), is defined as

POS_P(Q) = ∪_{X ∈ U/IND(Q)} PX,

where PX is the P-lower approximation of X and U/IND(Q) is the equivalence partition induced by Q. The positive region of Q with respect to P contains all objects in U that can be classified using the information contained in P. With this definition, the degree of dependency of Q on P, denoted γ_P(Q), is defined as

γ_P(Q) = |POS_P(Q)| / |U|,

where |X| denotes the cardinality of the set X.

The degree of attribute dependency measures how much one attribute subset depends on another attribute subset. γ_P(Q) = 1 means that Q totally depends on P, γ_P(Q) = 0 indicates that Q is totally independent of P, while 0 < γ_P(Q) < 1 denotes a partial dependency of Q on P. In particular, assuming P ⊆ C, γ_P(D) can be used to measure the dependency of the decision attributes on a condition attributes subset.

The task of rough set attribute reduction is to find a subset of the condition attributes set that functions as the original condition attributes set without loss of classification capability. Such a subset of the condition attributes set is called a reduct, defined as follows [14]. R ⊆ C is called a reduct of C if POS_R(D) = POS_C(D), or equivalently, γ_R(D) = γ_C(D). A reduct R of C is called a minimum reduct of C if no proper subset Q ⊂ R is a reduct of C. A reduct R of C has the same expressiveness of instances as C with respect to D. A decision table may have more than one reduct, and any one of them can be used to replace the original condition attributes set.
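To make the positive region and dependency degree concrete, the following is a minimal Java sketch (our illustration, not the authors' implementation) that partitions a small decision table into equivalence classes and computes γ_P(Q); the toy table and attribute indices are assumptions made only for this example.

```java
import java.util.*;

public class DependencyDegree {
    // Rows of a small decision table; each row holds categorical values.
    // The last column plays the role of the decision attribute D in this sketch.
    static final String[][] U = {
        {"sunny", "hot",  "no"},
        {"sunny", "mild", "yes"},
        {"rainy", "mild", "yes"},
        {"rainy", "hot",  "yes"},
    };

    // Partition U into the equivalence classes induced by the attribute subset 'attrs':
    // rows with identical values on 'attrs' fall into the same class.
    static Collection<List<Integer>> partition(int[] attrs) {
        Map<String, List<Integer>> classes = new LinkedHashMap<>();
        for (int i = 0; i < U.length; i++) {
            StringBuilder key = new StringBuilder();
            for (int a : attrs) key.append(U[i][a]).append('|');
            classes.computeIfAbsent(key.toString(), k -> new ArrayList<>()).add(i);
        }
        return classes.values();
    }

    // gamma_P(Q) = |POS_P(Q)| / |U|, where POS_P(Q) is the union of the P-lower
    // approximations of the equivalence classes of Q.
    static double gamma(int[] p, int[] q) {
        int pos = 0;
        for (List<Integer> pClass : partition(p)) {
            for (List<Integer> qClass : partition(q)) {
                if (qClass.containsAll(pClass)) {   // pClass lies inside one Q-class
                    pos += pClass.size();
                    break;
                }
            }
        }
        return (double) pos / U.length;
    }

    public static void main(String[] args) {
        int[] C = {0, 1};   // condition attributes
        int[] D = {2};      // decision attribute
        System.out.println("gamma_C(D)    = " + gamma(C, D));             // 1.0: consistent table
        System.out.println("gamma_{a0}(D) = " + gamma(new int[]{0}, D));  // 0.5: partial dependency
    }
}
```

On this toy table γ_C(D) = 1 because the table is consistent, while the first condition attribute alone yields only a partial dependency of 0.5.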

Finding all the reducts from a decision table, however, is NP-hard. Thus, a natural question is which reduct is the best. Without domain knowledge, the only source of information for selecting a reduct is the contents of the decision table. For example, the number of attributes can be used as the criterion, and the best reduct is then the one with the smallest number of attributes. Unfortunately, finding the reduct with the smallest number of attributes is also NP-hard. Some heuristic approaches to finding a good-enough reduct have been proposed. A recent algorithm, called QuickReduct, was developed by Shen and Chouchoulas [18]. QuickReduct is a filter approach to feature selection and a forward-searching hill climber. It initializes the candidate reduct R as an empty set, and attributes are added to R incrementally using the following heuristic: the next attribute to be added to R is the one with the highest significance to R with respect to the decision attributes. R grows until it becomes a reduct. The basic idea behind this algorithm is that the degree of attribute dependency is monotonically increasing. There are two problems with this algorithm, however. First, it is not guaranteed to yield the best reduct with the smallest number of attributes. Second, to calculate the significance of attributes, the discernibility function and positive regions must be computed, which is inefficient and time-consuming. A variant of QuickReduct, called QuickReduct II, is also a filter algorithm, but performs backward elimination using the same heuristic [18].

3 Relative Attribute Dependency Based on Rough Set Theory

In order to improve the efficiency of algorithms for finding optimal reducts of condition attributes, we proposed a new definition of attribute dependency, called relative attribute dependency, with which we showed a sufficient and necessary condition for an optimal reduct of the condition attributes [4]. The relative attribute dependency degree can be calculated by counting the distinct instances of a projection of the data set, instead of generating discernibility functions or positive regions. Thus the efficiency of finding minimum reducts is greatly improved.

Most existing rough set-based attribute reduction algorithms suffer from intensive computation of either discernibility functions or positive regions. In the family of QuickReduct algorithms, in order to choose the next attribute to be added to the candidate reduct, one must compute the degree of dependency of the decision attributes on each extension of the candidate by a remaining condition attribute. This means that the positive regions POS_{R ∪ {p}}(D), for each p ∈ C − R, must be computed. To improve the efficiency of attribute reduction algorithms, we define a new concept, called the degree of relative attribute dependency. For this purpose, we assume that the decision table is consistent, that is, for all t, s ∈ U, if f_D(t) ≠ f_D(s), then there exists q ∈ C such that f_q(t) ≠ f_q(s). This assumption is not realistic in most real-life applications. Fortunately, any decision table can be uniquely decomposed into two decision tables, one consistent and the other forming the boundary area, and our method can be performed on the consistent one.
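As a small illustration of the consistency assumption, the sketch below (ours, with an assumed column layout, not part of the original implementation) checks whether a decision table is consistent: any two rows that agree on all condition attributes must agree on the decision attribute.

```java
import java.util.*;

public class ConsistencyCheck {
    // A decision table is consistent if rows with identical condition-attribute
    // values always carry the same decision value.
    static boolean isConsistent(String[][] table, int[] condAttrs, int decisionAttr) {
        Map<String, String> seen = new HashMap<>();   // condition key -> decision value
        for (String[] row : table) {
            StringBuilder key = new StringBuilder();
            for (int a : condAttrs) key.append(row[a]).append('|');
            String previous = seen.putIfAbsent(key.toString(), row[decisionAttr]);
            if (previous != null && !previous.equals(row[decisionAttr])) {
                return false;   // same condition values, conflicting decisions
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String[][] table = {
            {"sunny", "hot", "no"},
            {"sunny", "hot", "yes"},   // conflicts with the first row
        };
        System.out.println(isConsistent(table, new int[]{0, 1}, 2));   // prints false
    }
}
```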

We first define the concept of projection and then define the relative attribute dependency. Let P ⊆ C ∪ D. The projection of U on P, denoted Π_P(U), is a sub-table of U constructed as follows: 1) eliminate the attributes in (C ∪ D) − P; and 2) merge all indiscernible tuples (rows). Let Q ⊆ C. The degree of relative dependency of Q on D over U, denoted K_Q(D), is defined as

K_Q(D) = |Π_Q(U)| / |Π_{Q ∪ D}(U)|,

where |Π_X(U)| is the number of equivalence classes in U/IND(X).

The relative attribute dependency is the proportion of the projection of the decision table on a condition attributes subset to the projection of the decision table on the union of that subset and the decision attributes set. The regular attribute dependency, on the other hand, is the proportion of the positive region of one attributes subset with respect to another attributes subset to the whole decision table. With the relative attribute dependency measure, we propose a new computation model to find a minimum reduct of condition attributes in a consistent decision table, described as follows.

The Computation Model Based on Relative Attribute Dependency (RAD):
Input: A consistent decision table U, condition attributes set C, and decision attributes set D
Output: A minimum reduct R of condition attributes set C with respect to D in U
Computation: Find a subset R of C such that K_R(D) = 1 and, for every proper subset Q ⊂ R, K_Q(D) < 1.

The following theorem shows that our proposed computation model is equivalent to the traditional model. The correctness of our model rests on the following condition: a subset of condition attributes is a minimum reduct in the traditional model if and only if it is a minimum reduct of condition attributes in our new model.

Theorem [4]. Assume U is consistent. R ⊆ C is a reduct of C with respect to D if and only if 1) K_R(D) = K_C(D) = 1; and 2) for every proper subset Q ⊂ R, K_Q(D) < K_C(D).

The degree of relative attribute dependency provides a mechanism for finding a minimum reduct of the condition attributes set of a decision table. This dependency measure can be calculated more efficiently than the traditional functional computation.
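The following Java sketch illustrates how the relative dependency K_Q(D) can be computed simply by counting distinct projected rows; the toy table and attribute indices are assumptions for the example, not the authors' code.

```java
import java.util.*;

public class RelativeDependency {
    // Number of distinct tuples after projecting the table on the given attributes,
    // i.e. |Pi_X(U)| = number of equivalence classes in U/IND(X).
    static int distinctProjectedRows(String[][] table, int[] attrs) {
        Set<String> rows = new HashSet<>();
        for (String[] row : table) {
            StringBuilder key = new StringBuilder();
            for (int a : attrs) key.append(row[a]).append('|');
            rows.add(key.toString());
        }
        return rows.size();
    }

    // K_Q(D) = |Pi_Q(U)| / |Pi_{Q union D}(U)|
    static double relativeDependency(String[][] table, int[] q, int[] d) {
        int[] qd = new int[q.length + d.length];
        System.arraycopy(q, 0, qd, 0, q.length);
        System.arraycopy(d, 0, qd, q.length, d.length);
        return (double) distinctProjectedRows(table, q)
             / distinctProjectedRows(table, qd);
    }

    public static void main(String[] args) {
        String[][] U = {
            {"sunny", "hot",  "no"},
            {"sunny", "mild", "yes"},
            {"rainy", "mild", "yes"},
            {"rainy", "hot",  "yes"},
        };
        System.out.println(relativeDependency(U, new int[]{0, 1}, new int[]{2})); // 1.0
        System.out.println(relativeDependency(U, new int[]{0},    new int[]{2})); // 0.666...
    }
}
```

On this table K_C(D) = 1 while the first attribute alone gives 2/3, so that attribute by itself is not a reduct.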

4 A Heuristic Algorithm for Finding Optimal Reducts

Some authors propose algorithms for constructing the best reduct, but what "best" means depends on the chosen criterion, such as the number of attributes in the reduct. In the absence of such criteria, the only source of information for selecting the reduct is the content of the data table. A common metric of data content is the information entropy contained in the data items. In this section, we develop a heuristic algorithm to implement the proposed model based on relative attribute dependency. The algorithm performs heuristic backward elimination in terms of the information entropy conveyed by the condition attributes: it calculates the information entropy conveyed by each attribute and selects the one with the maximum information gain for elimination. The goal of the algorithm is to find a subset R of the condition attributes set C such that R has the same classification power as C with respect to the given decision table. As our model suggests, such an R is a minimum reduct of C with total relative dependency on the decision attributes set D. To find such an R, we initialize R to contain all condition attributes in C, and then eliminate redundant attributes one by one.

Given the partition U/IND(D) of U by D, the entropy, or expected information, based on the partition U/q of U by an attribute q ∈ C is given by

E(q) = Σ_{Y ∈ U/q} (|Y| / |U|) I(q_Y), where I(q_Y) = − Σ_{X ∈ U/IND(D)} (|X ∩ Y| / |Y|) log_2 (|X ∩ Y| / |Y|).

Thus, the entropy E(q) can be written as

E(q) = − (1 / |U|) Σ_{X ∈ U/IND(D)} Σ_{Y ∈ U/q} |X ∩ Y| log_2 (|X ∩ Y| / |Y|).

Algorithm A: Attribute information entropy-based backward elimination
Input: A consistent decision table U, condition attributes set C, decision attributes set D
Output: R, a minimum reduct of condition attributes set C with respect to D in U
Procedure:
1. R ← C, Q ← ∅
2. For each attribute q ∈ C do
3.     Compute the entropy E(q) of q
4.     Q ← Q ∪ {<q, E(q)>}
5. While Q ≠ ∅ do
6.     q ← arg max{E(p) | <p, E(p)> ∈ Q}   // select the attribute with maximum entropy
7.     Q ← Q − {<q, E(q)>}
8.     If K_{R − {q}}(D) = 1 Then           // the relative dependency stays 1 without q
9.         R ← R − {q}                      // remove q
10. Return R

The following theorem demonstrates the correctness of Algorithm A.

Theorem [4]. The outcome of Algorithm A is a minimum reduct of C with respect to D in U.

Algorithm A has been implemented in the Java programming language. To calculate the information entropy of the condition attributes and the relative dependency, the original data set is sorted using the Radix-Sort technique. One can easily see that the time complexity of Algorithm A is O(|C| |U| log_2 |U|), where |C| is the number of condition attributes and |U| is the number of tuples in the decision table.
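Putting the pieces together, the sketch below is a compact Java rendering of the backward-elimination idea of Algorithm A under stated assumptions: it uses the projection-counting relative dependency from Section 3 and a straightforward entropy computation, with plain hashing and sorting rather than the Radix-Sort optimization mentioned above; the toy table, attribute indices, and helper names are ours, not the authors' implementation.

```java
import java.util.*;

public class AlgorithmASketch {
    static String[][] U = {
        {"sunny", "hot",  "high",   "no"},
        {"sunny", "mild", "high",   "yes"},
        {"rainy", "mild", "normal", "yes"},
        {"rainy", "hot",  "normal", "yes"},
    };
    static int decision = 3;   // index of the decision attribute D

    // |Pi_X(U)|: number of distinct rows after projecting on attribute set X.
    static int distinct(Set<Integer> attrs) {
        Set<String> rows = new HashSet<>();
        for (String[] r : U) {
            StringBuilder key = new StringBuilder();
            for (int a : attrs) key.append(r[a]).append('|');
            rows.add(key.toString());
        }
        return rows.size();
    }

    // K_R(D) = |Pi_R(U)| / |Pi_{R union D}(U)|
    static double relDependency(Set<Integer> r) {
        Set<Integer> rd = new TreeSet<>(r);
        rd.add(decision);
        return (double) distinct(r) / distinct(rd);
    }

    // Entropy E(q) of the decision attribute given the partition induced by q.
    static double entropy(int q) {
        Map<String, Map<String, Integer>> counts = new HashMap<>();
        for (String[] r : U)
            counts.computeIfAbsent(r[q], k -> new HashMap<>())
                  .merge(r[decision], 1, Integer::sum);
        double e = 0.0;
        for (Map<String, Integer> block : counts.values()) {
            int y = 0;
            for (int c : block.values()) y += c;                  // |Y|
            for (int xy : block.values())                          // |X intersect Y|
                e -= (double) xy / U.length * (Math.log((double) xy / y) / Math.log(2));
        }
        return e;
    }

    public static void main(String[] args) {
        Set<Integer> R = new TreeSet<>(Arrays.asList(0, 1, 2));   // start from all of C
        List<Integer> order = new ArrayList<>(R);
        // Try to drop attributes in order of decreasing entropy (the paper's heuristic).
        order.sort(Comparator.comparingDouble((Integer q) -> entropy(q)).reversed());
        for (int q : order) {
            Set<Integer> candidate = new TreeSet<>(R);
            candidate.remove(q);
            // q is redundant if the relative dependency stays 1 without it.
            if (!candidate.isEmpty() && relDependency(candidate) == 1.0) R = candidate;
        }
        System.out.println("Reduct found: " + R);
    }
}
```

On this toy table the pass drops the first attribute and returns {1, 2}, whose relative dependency on the decision attribute remains 1.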

5 Experiments

We selected 10 data sets from the UCI Machine Learning Repository [2] to evaluate our implemented algorithm; they are listed in Table 1. These data sets were carefully chosen to avoid numerical attributes and to reflect diverse sizes. Since the current version of our approach only considers categorical attributes, numerical attributes would need to be partitioned into non-intersecting intervals. To verify our approach, we chose data sets with a small number of tuples and a small number of attributes, a small number of tuples and a large number of attributes, a large number of tuples and a small number of attributes, as well as a large number of tuples and a large number of attributes. Table 1 describes each data set with its number of condition attributes and number of rows. All data sets have only one decision attribute. Since some data sets, such as breast-cancer-wisconsin, dermatology, zoo, and audiology, contain a tuple-identifier column that provides no information for data analysis, we removed these id columns. Table 1 also shows our experimental results using Algorithm A, where the column Number of Rows under Algorithm A gives the number of distinct tuples in the reduced data set obtained by projecting the original data set on the reduct that the algorithm found; the column Reduct Size shows the number of condition attributes contained in the reduct.

Table 1: The 10 data sets excerpted from the UCI machine learning repository

The output of Algorithm A is compared with the original data sets in Figure 1. Figure 1(a) compares the number of columns in the original data sets with that of the reducts found by our implemented algorithm. Figure 1(b) compares the number of distinct rows in the original data sets with that in the reducts discovered by our implemented algorithm. From the figure, one can see that in the cases where the reduct was smaller than the set of condition attributes in the original data set, the number of distinct rows was reduced considerably.

To verify the effectiveness of the reducts discovered by Algorithm A, we ran C4.5 [16] on both the original data sets and the reducts. The experimental results are listed in Table 2 and plotted in Figure 2. From Table 2 and Figure 2, one can see that, with C4.5, using the reducts discovered by Algorithm A we obtain slightly better classifiers than using the original data sets in most situations. In fact, for only 2 of the 10 data sets, namely BCW and YS, can C4.5 find more accurate classifiers from the original data sets than from the reducts induced by our algorithm, and the difference is very small (92.5% vs. 92.3% for BCW, 78.6% vs. 78.4% for YS).

These experimental results show that Algorithm A is very useful and can be used to find optimal reducts that replace the original data sets for most application data sets.

6 Related Work

Many feature subset selection algorithms have been proposed, and many approaches and algorithms for finding classifiers based on rough set theory have been developed over the past decades. Two approaches are particularly close to our algorithm. Grzymala-Busse [5, 6] developed the learning system LERS, which applies two rough set-based algorithms, LEM1 and LEM2, to deal with non-numerical and numerical attributes, respectively. LERS finds a minimal description of a concept, which is a set of rules. The rough measure of a rule describing a concept X is defined as |X ∩ Y| / |Y|, where Y is the set of all examples described by the rule. This definition is very similar to our relative attribute dependency, which is defined as |Π_Q(U)| / |Π_{Q ∪ D}(U)|. Nguyen and Nguyen [11] developed an approach that first constructs the discernibility relation by sorting the data tuples in the data table, then uses the discernibility relation to build the lower and upper approximations, and finally applies the approximations to find a semi-minimal reduct. Our algorithm takes advantage of the same Radix-Sort technique and has the same running efficiency as theirs, but it does not need to maintain the discernibility relation or the lower and upper approximations.

Almuallim and Dietterich [1] proposed an exhaustive search algorithm, FOCUS. The algorithm starts with an empty feature set and carries out exhaustive search until it finds a minimal combination of features that is sufficient for the data analysis task. It works on binary, noise-free data and runs in time O(N^M). They also proposed three heuristic algorithms to speed up the search. Kira and Rendell [7] developed a heuristic algorithm, RELIEF, for data classification. RELIEF assigns a relevance weight to each feature, which is meant to denote the relevance of the feature to the task. RELIEF samples instances randomly from the given data set and updates the relevance values based on the difference between the sampled instance and its two nearest instances of the same and opposite classes. It assumes two-class classification problems and does not handle redundant features: if most of the given features are relevant to the task, it selects most of them even though only a fraction is necessary for the classification. Another heuristic feature selection algorithm, PRESET, developed by Modrzejewski [10] in 1993, heuristically ranks the features and assumes a noise-free binary domain. Chi2, a heuristic algorithm proposed by Liu and Setiono [9] in 1995, automatically removes irrelevant continuous features based on the χ2 statistic and the inconsistency found in the data. Some other algorithms have been employed in data classification methods, for example, Quinlan's C4.5 [16] and Pagallo and Haussler's FRINGE [12].

7 Summary and Future Work

We proposed a novel definition, relative attribute dependency, with which we developed a computational model for finding optimal reducts of condition attributes. The relative attribute dependency degree can be calculated by counting the distinct instances of a projection of the data set, instead of generating discernibility functions or positive regions. Thus the efficiency of finding minimum reducts is greatly improved. We implemented an algorithm based on backward elimination using the object-oriented programming language Java, and evaluated it on 10 data sets from the UCI Machine Learning Repository. These data sets were carefully selected to cover various situations with different numbers of features and tuples. Our experimental results show that the algorithm significantly reduces the size of the original data sets and improves the prediction accuracy of the classifiers discovered by C4.5.

Our future work will focus on the following aspects: 1) apply more existing classification algorithms besides C4.5 to the results of our algorithm to see whether the classifiers can be improved; we expect the classifiers discovered from the reducts to be more accurate than those discovered from the original data sets; 2) extend the algorithm to process other types of data, such as numerical data; 3) attempt to develop novel classification algorithms based on our definition of relative attribute dependency.

References

1. Almuallim, H. and Dietterich, T., Learning Boolean Concepts in the Presence of Many Irrelevant Features, Artificial Intelligence, Vol. 69(1-2).
2. Blake, C. L. and Merz, C. J. (1998). UCI Repository of Machine Learning Databases [mlearn/mlrepository.html]. Irvine, CA: University of California, Department of Information and Computer Science.
3. Han, J., Hu, X., and Lin, T. Y., A New Computation Model for Rough Set Theory Based on Database Systems, 5th International Conference on Data Warehousing and Knowledge Discovery, Lecture Notes in Computer Science 2737, pp. 381-390.
4. Han, J., Hu, X., and Lin, T. Y., Feature Subset Selection Based on Relative Dependency Between Attributes, 4th International Conference on Rough Sets and Current Trends in Computing, Lecture Notes in Computer Science 3066, Springer.
5. Grzymala-Busse, J. W., LERS: A System for Learning from Examples Based on Rough Sets, in Intelligent Decision Support: Handbook of Applications and Advances of the Rough Sets Theory, ed. R. Slowinski, Kluwer Academic Publishers, pp. 3-18.
6. Grzymala-Busse, J. W., A Comparison of Three Strategies to Rule Induction, Proc. of the International Workshop on Rough Sets in Knowledge Discovery, Warsaw, Poland, April 5-13.

7. Kira, K. and Rendell, L. A., The Feature Selection Problem: Traditional Methods and a New Algorithm, 9th National Conference on Artificial Intelligence (AAAI).
8. Lin, T. Y. and Cercone, N., Applications of Rough Sets Theory and Data Mining, Kluwer Academic Publishers.
9. Liu, H. and Setiono, R., Chi2: Feature Selection and Discretization of Numeric Attributes, 7th IEEE International Conference on Tools with Artificial Intelligence.
10. Modrzejewski, M., Feature Selection Using Rough Sets Theory, European Conference on Machine Learning.
11. Nguyen, H. and Nguyen, S., Some Efficient Algorithms for Rough Set Methods, IPMU.
12. Pagallo, G. and Haussler, D., Boolean Feature Discovery in Empirical Learning, Machine Learning, Vol. 5, pp. 71-99.
13. Pawlak, Z., Rough Sets, International Journal of Information and Computer Science, 11(5).
14. Pawlak, Z., Rough Sets: Theoretical Aspects of Reasoning About Data, Kluwer Academic Publishers.
15. Quafafou, M. and Boussouf, M., Generalized Rough Sets Based Feature Selection, Intelligent Data Analysis, Vol. 4, pp. 3-17.
16. Quinlan, J. R., C4.5: Programs for Machine Learning, Morgan Kaufmann.
17. Sever, H., Raghavan, V., and Johnsten, D. T., The Status of Research on Rough Sets for Knowledge Discovery in Databases, 2nd International Conference on Nonlinear Problems in Aviation and Aerospace, Vol. 2.
18. Shen, Q. and Chouchoulas, A., A Rough-Fuzzy Approach for Generating Classification Rules, Pattern Recognition, Vol. 35.
