Consistency Based Attribute Reduction
Qinghua Hu, Hui Zhao, Zongxia Xie, and Daren Yu
Harbin Institute of Technology, Harbin, P. R. China
huqinghua@hcms.hit.edu.cn

Rough sets are widely used in feature subset selection and attribute reduction. In most of the existing algorithms, the dependency function is employed to evaluate the quality of a feature subset. The disadvantages of using dependency are discussed in this paper, and the problem with the forward greedy search algorithm based on dependency is presented. We introduce the consistency measure to deal with these problems. The relationship between dependency and consistency is analyzed. It is shown that the consistency measure reflects not only the size of the decision positive region, like dependency, but also the sample distribution in the boundary region. Therefore it can describe the distinguishing power of an attribute set more finely. Based on consistency, we redefine redundancy and the reduct of a decision system, and we construct a forward greedy search algorithm to find reducts based on consistency. What is more, we employ cross validation to test the selected features and to reduce the overfitting features in a reduct. The experimental results with UCI data show that the proposed algorithm is effective and efficient.

1 Introduction

As the capability of gathering and storing data increases, there are a lot of candidate features in some pattern recognition and machine learning tasks. Applications show that excessive features will not only significantly slow down the learning process, but also decrease the generalization power of the learned classifiers. Attribute reduction, also called feature subset selection, is usually employed as a preprocessing step to select part of the features and focus the learning algorithm on the relevant information [1, 3, 4, 5, 7, 8]. In recent years, rough set theory has been widely discussed and used in attribute reduction and feature selection [6, 7, 8, 14, 16, 17]. Reduct is a proper term in rough set methodology. It
means a minimal attribute subset with the same approximating power as the whole attribute set [14]. This definition shows that a reduct should carry the least redundant information while not losing the classification ability of the raw data. Thus the attributes in a reduct should not only be strongly relevant to the learning task, but also not be redundant with each other. This property of reducts exactly accords with the objective of feature selection. Thereby, the process of searching for reducts, called attribute reduction, is a feature subset selection process. So far, a series of approaches to searching for reducts have been published. Discernibility matrices [11, 14] were introduced to store the features which distinguish each pair of objects, and Boolean operations were then conducted on the matrices to find all of the reducts. The main problem of this method is its space and

Z.-H. Zhou, H. Li, and Q. Yang (Eds.): PAKDD 2007, LNAI 4426, pp. 96-107, 2007. Springer-Verlag Berlin Heidelberg 2007
time cost: we need a 10^4 x 10^4 matrix if there are 10^4 samples. What is more, it is also time-consuming to search for reducts in the matrix with Boolean operations. With the dependency function, a heuristic search algorithm was constructed [1, 6, 7, 8, 16].

There are some problems in dependency based attribute reduction. The dependency function in rough set approaches is the ratio of the size of the positive region over that of the sample space. The positive region is the set of samples which can be undoubtedly classified into a certain class according to the existing attributes. From the definition of the dependency function, we can find that it ignores the influence of the boundary samples, which may belong to more than one class. However, in classification learning, the boundary samples also exert an influence on the learned results. For example, in learning decision trees with CART or C4.5, the samples in leaf nodes sometimes belong to more than one class [2, 10]. In this case, the nodes are labeled with the majority class of their samples. The dependency function does not take this kind of samples into account.

What is more, there is another risk in using the dependency function in greedy feature subset search algorithms. In a forward greedy search, we usually start with an empty set of attributes and then add the selected features into the reduct one by one. In the first round, we need to compute the dependency of each single attribute and select the attribute with the greatest dependency value. We find that in some applications the greatest dependency of any single attribute is zero, because none of the candidate features can classify any of the samples beyond dispute. Therefore, according to the criterion that the dependency function should be greater than zero, none of the attributes can be selected, and the feature selection algorithm finds nothing. However, some combinations of the attributes are able to distinguish all of the samples although no single one of them can distinguish any. As far as we know, there is no research reporting on this issue so far.

These issues essentially result from the same problem of the dependency function: it completely neglects the boundary samples. In this paper, we introduce a function proposed by Dash and Liu [3], called consistency, to evaluate the significance of attributes. We discuss the relationship between dependency and consistency, and employ the consistency function to construct a greedy search attribute reduction algorithm. The main difference between the two functions lies in how they treat the boundary samples. Consistency counts not only the positive region, but also the samples of the majority class in each part of the boundary region. Therefore, even if the positive region is empty, we can still compare the distinguishing power of features according to the sample distribution in the boundary region. Consistency is the ratio of consistent samples; hence it is linear in the number of consistent samples, and it is easy to specify a stopping criterion in a consistency-based algorithm. With numerical experiments, we will show that such a stopping criterion is necessary for real-world applications.

In the next section, we review the basic concepts of rough sets. We then present the definition and properties of the consistency function, compare the dependency function with consistency, and construct consistency based attribute reduction in Section 3. We present the results of experiments in Section 4. Finally, conclusions are presented in Section 5.
2 Basic Concepts of Rough Sets

Rough set theory, which was introduced to deal with imperfect and vague concepts, has attracted a lot of attention from both theoretical and applied research areas. Data sets are usually given in the form of tables; we call such a data table an information system, formulated as IS = <U, A, V, f>, where U = {x_1, x_2, ..., x_n} is a finite and nonempty set of objects, called the universe, A is the set of attributes characterizing the objects, V is the domain of attribute values, and f: U x A -> V is the information function. If the attribute set is divided into a condition attribute set C and a decision attribute set D, the information system is also called a decision table.

With an arbitrary attribute subset B ⊆ A there is an associated indiscernibility relation IND(B):

IND(B) = {<x, y> ∈ U x U | ∀a ∈ B: a(x) = a(y)}.

<x, y> ∈ IND(B) means that objects x and y are indiscernible with respect to the attribute set B. Obviously, an indiscernibility relation is an equivalence relation, which satisfies the properties of reflexivity, symmetry and transitivity. The equivalence class induced by the attributes B is denoted by [x_i]_B = {x ∈ U | <x_i, x> ∈ IND(B)}. Equivalence classes generated by B are also called B-elemental granules or B-information granules. The set of elemental granules forms a concept system, which is used to characterize the imperfect concepts in the information system. Given an arbitrary concept X in the information system, two unions of elemental granules are associated with X:

B̲X = ∪{[x]_B | [x]_B ⊆ X, x ∈ U},   B̄X = ∪{[x]_B | [x]_B ∩ X ≠ ∅, x ∈ U}.

The concept X is approximated with these two sets of elemental granules. B̲X and B̄X are called the lower and upper approximations of X in terms of the attributes B. B̲X is also called the positive region of X. X is definable if B̲X = B̄X, which means the concept X can be perfectly characterized with the knowledge B; otherwise, X is indefinable. An indefinable set is called a rough set. BN_B(X) = B̄X − B̲X is called the boundary of the approximations. For a definable set, the boundary is empty.

Given a decision table <U, C ∪ D, V, f>, C and D generate two partitions of the universe. Machine learning usually involves using the condition knowledge to approximate the decision, i.e., finding the mapping from the conditions to the decisions. Approximating U/D with U/C, the positive and boundary regions are defined as

POS_C(D) = ∪_{X ∈ U/D} C̲X,   BN_C(D) = ∪_{X ∈ U/D} C̄X − ∪_{X ∈ U/D} C̲X.

The boundary region is the set of elemental granules which cannot be perfectly described by the knowledge C, while the positive region is the set of C-elemental granules which completely belong to one of the decision concepts. The size of the positive or boundary region reflects the approximation power of the condition
attributes. Given a decision table and any B ⊆ C, we say that the decision attribute set D depends on the condition attributes B with degree k, denoted B ⇒_k D, where

k = γ_B(D) = |POS_B(D)| / |U|.

The dependency function k measures the approximation power of a condition attribute set with respect to the decision D. In data mining, and especially in feature selection, it is important to find the dependence relations between attribute sets and a concise and efficient representation of the data.

Given a decision table T = <U, C ∪ D, V, f>, if P ⊆ Q ⊆ C, we have γ_P(D) ≤ γ_Q(D).

Given a decision table T = <U, C ∪ D, V, f>, B ⊆ C and a ∈ B, we say that the condition attribute a is indispensable in B if γ_{B−{a}}(D) < γ_B(D); otherwise we say a is redundant. We say B ⊆ C is independent if every a in B is indispensable.

An attribute subset B is a reduct of the decision table if
1) γ_B(D) = γ_C(D);
2) ∀a ∈ B: γ_B(D) > γ_{B−{a}}(D).

A reduct of a decision table is an attribute subset which keeps the approximating capability of the whole set of condition attributes while containing no redundant attribute. The term reduct gives a concise and complete way to define the objective of feature selection and attribute reduction.

3 Consistency Based Attribute Reduction

A binary classification problem in a discrete space is shown in Figure 1, where the samples are divided into a finite set of equivalence classes {E_1, E_2, ..., E_K} based on their feature values; the samples with the same feature values are grouped into one equivalence class. We find that some of the equivalence classes are pure, as their samples belong to a single decision class, but there are also some inconsistent equivalence classes, such as E_3 and E_4 in Figure 1. According to rough set theory, the latter form the decision boundary region, and the set of consistent equivalence classes forms the decision positive region. The objective of feature selection is to find a feature subset which minimizes the inconsistent region, in either the discrete or the numerical case, and accordingly minimizes the Bayesian decision error. It is therefore desirable to have a measure which reflects the size of the inconsistent region in discrete and numerical spaces for feature selection.

Dependency reflects the ratio of consistent samples over the whole set of samples. Therefore dependency does not take the boundary samples into account in computing the significance of attributes: once there are inconsistent samples in an equivalence class, the whole equivalence class is simply ignored. However, inconsistent samples can be divided into two groups: a subset of samples under the majority class and a subset under the minority classes. According
Fig. 1. Classification complexity in a discrete feature space: equivalence classes E_1, ..., E_6 with class-conditional probabilities p(E_i | ω_j); in both panel (1) and panel (2), E_3 and E_4 are inconsistent.

to the Bayesian rule, only the samples under the minority classes are misclassified. For example, the samples in E_3 and E_4 in Figure 1 are inconsistent, but only the samples of class ω_2 in E_3 and of class ω_1 in E_4 are misclassified. The classification power in this case can be given by

f = 1 − [P(ω_2 | E_3)P(E_3) + P(ω_1 | E_4)P(E_4)].

Dependency cannot reflect this true classification complexity. In the discrete case, we can see from the comparison of panels (1) and (2) of Figure 1 that although the probabilities of inconsistent samples are identical, the probabilities of misclassification are different. The dependency function in rough sets cannot reflect this difference. In [3], Dash and Liu introduced the consistency function, which can measure the difference. Now we present the basic definitions. The consistency measure is defined through the inconsistency rate, computed as follows.

Definition 1. A pattern is considered inconsistent if there are at least two objects that match on the whole condition attribute set but carry different decision labels.

Definition 2. The inconsistency count ξ_i for a pattern p_i of a feature subset B is the number of times p_i appears in the data minus the largest number of its occurrences under a single class label.

Definition 3. The inconsistency rate of a feature subset B is the sum of the inconsistency counts over all patterns of the feature subset that appear in the data, divided by the number of samples |U|, namely Σ_i ξ_i / |U|. Correspondingly, the consistency is computed as

δ_B(D) = (|U| − Σ_i ξ_i) / |U|.

Based on the above analysis, we can understand that dependency is the ratio of samples undoubtedly correctly classified, while consistency is the ratio of samples probably correctly classified. There are two kinds of samples in POS_B(D) ∪ M: POS_B(D) is the set of consistent samples, while M is the set of samples belonging to the largest class within each part of the boundary region. In this paper, we will call M the pseudo-consistent samples.
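As a concrete illustration of Definitions 1-3 (a minimal sketch, not the authors' code; the table layout and the function name are invented for this example), dependency and consistency can be computed from a decision table stored as (condition-values, label) pairs:

```python
from collections import Counter, defaultdict

def dependency_and_consistency(rows, attrs):
    """rows: list of (values, label); attrs: indices of the condition
    attributes defining the patterns (equivalence classes)."""
    groups = defaultdict(list)  # pattern -> list of decision labels
    for values, label in rows:
        groups[tuple(values[i] for i in attrs)].append(label)
    n = len(rows)
    # positive region: samples in pure (consistent) equivalence classes
    pos = sum(len(labels) for labels in groups.values()
              if len(set(labels)) == 1)
    # inconsistency count of a pattern: occurrences minus majority count
    xi = sum(len(labels) - Counter(labels).most_common(1)[0][1]
             for labels in groups.values())
    return pos / n, (n - xi) / n  # gamma (dependency), delta (consistency)

# One pure equivalence class {x1, x2} and one mixed class with labels a, a, a, b:
table = [((0,), 'a'), ((0,), 'a'),
         ((1,), 'a'), ((1,), 'a'), ((1,), 'a'), ((1,), 'b')]
gamma, delta = dependency_and_consistency(table, [0])
print(gamma, delta)  # gamma = 2/6, delta = 5/6
```

Here γ = 2/6 counts only the two samples of the pure class, while δ = 5/6 additionally credits the three majority-class (pseudo-consistent) samples of the mixed class, in line with γ_B(D) ≤ δ_B(D).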
Property 1. Given a decision table <U, C ∪ D, V, f> and B ⊆ C, we have 0 ≤ δ_B(D) ≤ 1 and γ_B(D) ≤ δ_B(D).

Property 2 (monotonicity). Given a decision table <U, C ∪ D, V, f>, if B_1 ⊆ B_2 ⊆ C, we have δ_{B_1}(D) ≤ δ_{B_2}(D).

Property 3. Given a decision table <U, C ∪ D, V, f>, we have δ_C(D) = γ_C(D) = 1 if and only if U/C ⪯ U/D (every C-equivalence class is contained in a decision class), namely, the table is consistent.

Definition 4. Given a decision table T = <U, C ∪ D, V, f>, B ⊆ C and a ∈ B, we say that the condition attribute a is indispensable in B if δ_{B−{a}}(D) < δ_B(D); otherwise we say a is redundant. We say B ⊆ C is independent if every attribute a in B is indispensable.

δ_B(D) reflects not only the size of the positive region, but also the distribution of the boundary samples. An attribute is said to be redundant if the consistency does not decrease when we delete it. Here the term redundant has two meanings: the first is relevant but redundant, the same as the meaning in the general literature [6, 7, 8, 14, 16, 17]; the second is irrelevant. So consistency can detect both kinds of superfluous attributes [3].

Definition 5. An attribute subset B is a consistency-based reduct of the decision table if
(1) δ_B(D) = δ_C(D);
(2) ∀a ∈ B: δ_B(D) > δ_{B−{a}}(D).

In this definition, the first condition guarantees that the reduct has the same distinguishing ability as the whole set of features; the second guarantees that all of the attributes in the reduct are indispensable. Therefore, there is no superfluous attribute in the reduct.

Finding the optimal subset of features is an NP-hard problem: we would have to evaluate 2^N − 1 combinations of features to find the optimal subset if there are N features in the decision table. Considering the computational complexity, here we construct a forward greedy search algorithm based on the consistency function. We start with an empty set of attributes and add one attribute to the reduct in each round; the selected attribute should maximize the increment of consistency. Given an attribute subset B, we evaluate the significance of an attribute a as

SIG(a, B, D) = δ_{B ∪ {a}}(D) − δ_B(D).

SIG(a, B, D) is the increment of consistency obtained by introducing the new attribute a given B. The measure is linear in the number of new consistent and pseudo-consistent samples. Formally, the forward greedy reduction algorithm based on consistency can be formulated as follows.
Algorithm: Greedy Reduction Algorithm based on Consistency
Input: decision table <U, C ∪ D, V, f>
Output: one reduct red
Step 1: red ← ∅; // red is the pool of the selected attributes
Step 2: for each a_i ∈ C − red, compute SIG(a_i, red, D) = δ_{red ∪ {a_i}}(D) − δ_{red}(D)
Step 3: select the attribute a_k which satisfies SIG(a_k, red, D) = max_i SIG(a_i, red, D)
Step 4: if SIG(a_k, red, D) > 0, then red ← red ∪ {a_k} and go to Step 2; else return red
Step 5: end

In the first round, we start with an empty set and specify δ_∅(D) = 0. In this algorithm, we generate attribute subsets with a semi-exhaustive search; namely, we evaluate all of the remaining attributes in each round with the consistency function and select the feature producing the maximal significance. The algorithm stops when adding any of the remaining attributes no longer increases the consistency value. In real-world applications, we can stop the algorithm when the increment of consistency is less than a given threshold, to avoid the over-fitting problem; we discuss this problem in detail in Section 4. The output of the algorithm is a reduced decision table, from which the irrelevant attributes and the relevant but redundant attributes have been deleted. The output will be validated with two popular learning algorithms, CART and SVM, in Section 4. By employing a hashing mechanism, we can compute the inconsistency rate approximately with a time complexity of O(|U|) [3]. In the worst case, the whole computational complexity of the algorithm is

|U||C| + |U|(|C| − 1) + ... + |U| = (|C| + 1)|C||U| / 2.

4 Experimental Analysis

There are two main objectives in conducting the experiments. First, we compare the proposed method with the dependency based algorithm. Second, we study the classification performance of the attributes selected with the proposed algorithm, in particular, how the classification accuracy varies as new features are added; this can tell us where the algorithm should be stopped. We downloaded data sets from the UCI Repository of machine learning databases. The data sets are described in Table 1. There are some numerical attributes in the data sets; here we employ four discretization techniques to transform the numerical data into
Table 1. Data description (data set, abbreviation, samples, features, classes): Australian Credit Approval (Crd), Ecoli (Ecoli), Heart disease (Heart), Ionosphere (Iono), Sonar, Mines vs. Rocks (Sonar), Wisconsin Diagnostic Breast Cancer (WDBC), Wisconsin Prognostic Breast Cancer (WPBC), Wine recognition (Wine).

categorical ones: equal-width, equal-frequency, FCM and entropy. Then we run the dependency based algorithm [8] and the proposed one on the discretized data sets. The numbers of selected features are presented in Table 2, where DP stands for the dependency based algorithm and C stands for the consistency based algorithm. From Table 2, we can find that there is a serious problem with the dependency based algorithm: it selects too few features for classification learning on some data sets. On the data discretized with the equal-width method, the dependency based algorithm selects only one attribute, while the consistency based one selects 7. With the equal-frequency method, the dependency based algorithm selects nothing for the data sets Heart, Sonar and WPBC. Similar cases occur with the entropy and FCM based discretization methods. Obviously, the results are unacceptable if a feature selection algorithm cannot find anything. By contrast, the consistency based attribute reduction algorithm finds feature subsets of moderate size for all of the data sets. What is more, the sizes of the feature subsets selected by the two algorithms are comparable when the dependency based algorithm works well. Why does the dependency based algorithm find nothing for some data sets?
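This failure mode can be reproduced on a tiny XOR-style table. The sketch below (helper names are invented; only the definitions of γ, δ and SIG are used, with δ_∅(D) = 0 as the algorithm specifies) shows that every single attribute has zero dependency, so a dependency-guided greedy search selects nothing, while the consistency-guided search recovers both attributes:

```python
from collections import Counter, defaultdict

def _group(rows, attrs):
    groups = defaultdict(list)  # pattern -> list of decision labels
    for values, label in rows:
        groups[tuple(values[i] for i in attrs)].append(label)
    return groups

def dependency(rows, attrs):  # gamma: fraction in pure equivalence classes
    groups = _group(rows, attrs)
    return sum(len(ls) for ls in groups.values() if len(set(ls)) == 1) / len(rows)

def consistency(rows, attrs):  # delta: (|U| - sum of inconsistency counts)/|U|
    groups = _group(rows, attrs)
    xi = sum(len(ls) - Counter(ls).most_common(1)[0][1] for ls in groups.values())
    return (len(rows) - xi) / len(rows)

def greedy_reduct(rows, n_attrs, measure):
    red, current = [], 0.0  # delta of the empty set is specified as 0
    while True:
        rest = [i for i in range(n_attrs) if i not in red]
        if not rest:
            return red
        best = max(rest, key=lambda a: measure(rows, red + [a]))
        gain = measure(rows, red + [best]) - current
        if gain <= 0:  # no remaining attribute adds distinguishing power
            return red
        red.append(best)
        current += gain

# XOR: the label is f0 xor f1; each single attribute leaves every class mixed 50/50
xor = [((0, 0), 'n'), ((0, 1), 'y'), ((1, 0), 'y'), ((1, 1), 'n')]
print([dependency(xor, [a]) for a in (0, 1)])  # [0.0, 0.0]
print(greedy_reduct(xor, 2, dependency))       # [] -- dependency-guided search stalls
print(greedy_reduct(xor, 2, consistency))      # [0, 1] -- consistency recovers both
```

Each single attribute gives δ = 0.5 > 0, so the consistency-guided search can rank attributes even when the positive region is empty, which is exactly the advantage discussed above.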
As we know, dependency reflects just the ratio of the positive region. The forward greedy algorithm starts with an empty set and adds one best attribute at a time, namely the attribute that yields the greatest increase of the dependency function, until the function reaches its maximum possible value for the data set. In the first turn, we need to evaluate each single attribute. For some data sets, the dependency of every single attribute is zero; therefore no attribute can be added into the pool in the first turn, and the algorithm stops there. Sometimes the algorithm can

Table 2. The numbers of selected features with different methods, for the raw data and for the equal-width, equal-frequency, entropy and FCM discretizations, with DP and C columns for the dependency and consistency based algorithms (rows: Crd, Ecoli, Heart, Iono, Sonar, WDBC, WPBC, Wine, and the average).
also stop in the second or the third turn. However, the selected features are then not enough for classification learning. Consistency overcomes this problem, as it can reflect the change of the distribution of boundary samples.

Now we use the selected data to train classifiers with the CART and SVM learning algorithms. We test the classification power of the selected data with 10-fold cross validation. The average classification accuracies with CART and SVM are presented in Tables 3 and 4, respectively. From Table 3, we can find that most of the reduced data sets keep, or even improve, the classification power when the numbers of selected attributes are appropriate, although most of the candidate features are deleted from the data. This shows that most of the features in these data sets are irrelevant or redundant for training decision trees and should therefore be deleted. However, the classification performance greatly decreases if the data are excessively reduced, as for Iono in the equal-width case and Ecoli in the entropy and FCM cases.

Table 3. Classification accuracy with 10-fold cross validation (CART), for the raw data and the four discretizations, with DP and C columns (rows: Crd, Ecoli, Heart, Iono, Sonar, WDBC, WPBC, Wine, and the average).

We can also find from Table 4 that most of the classification accuracies of the reduced data decrease a little compared with the original data. Correspondingly, the average classification accuracies for all four discretization algorithms are a little lower than those of the original data. This shows that neither the dependency nor the consistency based feature selection algorithm fits SVM learning well, because both dependency and consistency compute the distinguishing power in discrete spaces. Table 5 shows the features selected by the consistency based algorithm, and the turns in which they were selected, for part of the data, where we use the FCM discretized data sets. The trends of consistency and of the classification accuracies with CART and SVM are shown in Figure 4.

Table 4. Classification accuracy with 10-fold cross validation (SVM), in the same format as Table 3.
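The question of where to stop adding features, examined next, amounts to keeping only the prefix of the greedy selection order that peaks in cross-validation accuracy. A hedged sketch (the helper and the accuracy profile are hypothetical; the original work estimated accuracy with CART and SVM under 10-fold cross validation):

```python
def post_prune(selected, cv_accuracy):
    """selected: attributes in the order the greedy search added them.
    cv_accuracy: callable mapping a feature prefix to a cross-validation
    accuracy estimate. Returns the prefix ending at the accuracy peak."""
    scores = [cv_accuracy(selected[:k]) for k in range(1, len(selected) + 1)]
    best_k = max(range(len(scores)), key=lambda i: scores[i]) + 1
    return selected[:best_k], scores[best_k - 1]

# Toy accuracy profile that rises, peaks, then degrades (an over-fitting tail)
profile = {1: 0.70, 2: 0.81, 3: 0.86, 4: 0.84, 5: 0.83}
kept, acc = post_prune(['a', 'b', 'c', 'd', 'e'], lambda fs: profile[len(fs)])
print(kept, acc)  # ['a', 'b', 'c'] 0.86
```

The features after the peak are dropped, which is the post-pruning idea validated in Table 6 below.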
Table 5. The features selected with FCM discretization + consistency, and the turn (1st to 10th) in which each was selected, for the data sets Heart, Iono, Sonar, WDBC and WPBC.

Fig. 4. Trends of consistency and of the classification accuracies with CART and SVM against the number of selected features: (1) Heart, (2) Iono, (3) Sonar, (4) WDBC, (5) WPBC.

In all five plots, the consistency increases monotonically with the number of selected attributes. The maximal value of consistency is 1, which shows that the corresponding decision table is consistent: with the selected attributes, all of the samples can be distinguished. What is more, it is noticeable that the consistency rises rapidly at the beginning and then increases slowly until it stops at 1. This means that the majority of samples can be distinguished with a few features, while the remaining selected features are introduced to discern only several samples. This may lead to the over-fitting problem; therefore the algorithm should be stopped earlier, or we need a pruning algorithm to delete the over-fitting features.

The classification accuracy curves also show this problem. In Figure 4, the accuracies with CART and SVM rise at first, arrive at a peak, and then keep unchanged or even decrease. In terms of classification learning, this shows that the features after the peak are useless; they sometimes even deteriorate the learning performance. Here we can take two measures to overcome the problem. The first is to stop the algorithm when the increment of consistency is less than a given threshold. The second is to employ some learning algorithm to validate the selected features, and to delete the features after the accuracy peak. However, the first one, called pre-pruning, is sometimes not feasible because we usually cannot exactly predict where the algorithm should stop. The latter, called post-pruning, is widely employed. In this work, cross validation is introduced to test the selected features. Table 6 shows the numbers of selected features and the corresponding classification accuracies. We can find that the classification performance improves in most of the cases, while at the same time the features selected with consistency are further reduced. Especially for the data sets Heart and Iono, the improvement is as high as 10% and 18% with the CART algorithm.

Table 6. Comparison of the numbers of features and of the classification performance with post-pruning, for the data sets Heart, Iono, Sonar, WDBC and WPBC (raw-data features; CART and SVM accuracies and numbers of selected features).

5 Conclusions

In this paper, we introduce the consistency function to overcome the problems in dependency based algorithms. We discuss the relationship between dependency and consistency, and analyze the properties of consistency. With this measure, redundancy and reduct are redefined. We construct a forward greedy attribute reduction algorithm based on consistency. The numerical experiments show that the proposed method is effective. Some conclusions are drawn as follows. Compared with dependency, consistency reflects not only the size of the decision positive region, but also the sample distribution in the boundary region; therefore, the consistency measure is able to describe the distinguishing power of an attribute set more finely than the dependency function. Consistency is monotonic: the consistency value increases or stays unchanged when a new attribute is added into the attribute set. What is more, some attributes are introduced into the reduct just for distinguishing a few samples; if we keep these attributes in the final result, they may overfit the data. Therefore, a pruning technique is required. We use 10-fold cross validation to test the results in the experiments and find more effective and efficient feature subsets.

References

1. Bhatt, R. B., Gopal, M.: On fuzzy-rough sets approach to feature selection. Pattern Recognition Letters 26 (2005)
2. Breiman, L., Friedman, J., Olshen, R., Stone, C.: Classification and Regression Trees. Wadsworth International, California
3. Dash, M., Liu, H.: Consistency-based search in feature selection. Artificial Intelligence 151 (2003)
4. Guyon, I., Weston, J., Barnhill, S., et al.: Gene selection for cancer classification using support vector machines. Machine Learning 46 (2002)
5. Guyon, I., Elisseeff, A.: An introduction to variable and feature selection. Journal of Machine Learning Research 3 (2003)
6. Hu, Q. H., Li, X. D., Yu, D. R.: Analysis on classification performance of rough set based reducts. In: Yang, Q., Webb, G. (Eds.): PRICAI 2006, LNAI 4099, 2006. Springer-Verlag Berlin Heidelberg
7. Hu, Q. H., Yu, D. R., Xie, Z. X.: Information-preserving hybrid data reduction based on fuzzy-rough techniques. Pattern Recognition Letters 27 (2006)
8. Jensen, R., Shen, Q.: Semantics-preserving dimensionality reduction: rough and fuzzy-rough-based approaches. IEEE Transactions on Knowledge and Data Engineering 16 (2004)
9. Liu, H., Yu, L.: Toward integrating feature selection algorithms for classification and clustering. IEEE Transactions on Knowledge and Data Engineering 17 (2005)
10. Quinlan, J. R.: Induction of decision trees. Machine Learning 1 (1986)
11. Skowron, A., Rauszer, C.: The discernibility matrices and functions in information systems. In: Slowinski, R. (ed.): Intelligent Decision Support: Handbook of Applications and Advances of the Rough Sets Theory, 1991
12. Ślęzak, D.: Approximate decision reducts. Ph.D. Thesis, Warsaw University, 2001
13. Ślęzak, D.: Approximate entropy reducts. Fundamenta Informaticae 53 (2002)
14. Swiniarski, R. W., Skowron, A.: Rough set methods in feature selection and recognition. Pattern Recognition Letters 24 (2003)
15. Xie, Z. X., Hu, Q. H., Yu, D. R.: Improved feature selection algorithm based on SVM and correlation. Lecture Notes in Computer Science 3971 (2006)
16. Zhong, N., Dong, J., Ohsuga, S.: Using rough sets with heuristics for feature selection. Journal of Intelligent Information Systems 16 (2001)
17. Ziarko, W.: Variable precision rough sets model. Journal of Computer and System Sciences 46 (1993) 39-59
Data Mining 3.2 Decision Tree Classifier Fall 2008 Instructor: Dr. Masoud Yaghini Outline Introduction Basic Algorithm for Decision Tree Induction Attribute Selection Measures Information Gain Gain Ratio
More informationROUGH MEMBERSHIP FUNCTIONS: A TOOL FOR REASONING WITH UNCERTAINTY
ALGEBRAIC METHODS IN LOGIC AND IN COMPUTER SCIENCE BANACH CENTER PUBLICATIONS, VOLUME 28 INSTITUTE OF MATHEMATICS POLISH ACADEMY OF SCIENCES WARSZAWA 1993 ROUGH MEMBERSHIP FUNCTIONS: A TOOL FOR REASONING
More informationFeature Selection Based on Relative Attribute Dependency: An Experimental Study
Feature Selection Based on Relative Attribute Dependency: An Experimental Study Jianchao Han, Ricardo Sanchez, Xiaohua Hu, T.Y. Lin Department of Computer Science, California State University Dominguez
More informationFlexible-Hybrid Sequential Floating Search in Statistical Feature Selection
Flexible-Hybrid Sequential Floating Search in Statistical Feature Selection Petr Somol 1,2, Jana Novovičová 1,2, and Pavel Pudil 2,1 1 Dept. of Pattern Recognition, Institute of Information Theory and
More informationRough Set Approaches to Rule Induction from Incomplete Data
Proceedings of the IPMU'2004, the 10th International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems, Perugia, Italy, July 4 9, 2004, vol. 2, 923 930 Rough
More informationSSV Criterion Based Discretization for Naive Bayes Classifiers
SSV Criterion Based Discretization for Naive Bayes Classifiers Krzysztof Grąbczewski kgrabcze@phys.uni.torun.pl Department of Informatics, Nicolaus Copernicus University, ul. Grudziądzka 5, 87-100 Toruń,
More informationClassification with Diffuse or Incomplete Information
Classification with Diffuse or Incomplete Information AMAURY CABALLERO, KANG YEN Florida International University Abstract. In many different fields like finance, business, pattern recognition, communication
More informationStatistical Pattern Recognition
Statistical Pattern Recognition Features and Feature Selection Hamid R. Rabiee Jafar Muhammadi Spring 2013 http://ce.sharif.edu/courses/91-92/2/ce725-1/ Agenda Features and Patterns The Curse of Size and
More informationCollaborative Rough Clustering
Collaborative Rough Clustering Sushmita Mitra, Haider Banka, and Witold Pedrycz Machine Intelligence Unit, Indian Statistical Institute, Kolkata, India {sushmita, hbanka r}@isical.ac.in Dept. of Electrical
More informationStatistical Pattern Recognition
Statistical Pattern Recognition Features and Feature Selection Hamid R. Rabiee Jafar Muhammadi Spring 2012 http://ce.sharif.edu/courses/90-91/2/ce725-1/ Agenda Features and Patterns The Curse of Size and
More informationFeature Selection with Adjustable Criteria
Feature Selection with Adjustable Criteria J.T. Yao M. Zhang Department of Computer Science, University of Regina Regina, Saskatchewan, Canada S4S 0A2 E-mail: jtyao@cs.uregina.ca Abstract. We present a
More informationStatistical Pattern Recognition
Statistical Pattern Recognition Features and Feature Selection Hamid R. Rabiee Jafar Muhammadi Spring 2014 http://ce.sharif.edu/courses/92-93/2/ce725-2/ Agenda Features and Patterns The Curse of Size and
More information6. Dicretization methods 6.1 The purpose of discretization
6. Dicretization methods 6.1 The purpose of discretization Often data are given in the form of continuous values. If their number is huge, model building for such data can be difficult. Moreover, many
More informationFinding Rough Set Reducts with SAT
Finding Rough Set Reducts with SAT Richard Jensen 1, Qiang Shen 1 and Andrew Tuson 2 {rkj,qqs}@aber.ac.uk 1 Department of Computer Science, The University of Wales, Aberystwyth 2 Department of Computing,
More informationAn Empirical Study of Lazy Multilabel Classification Algorithms
An Empirical Study of Lazy Multilabel Classification Algorithms E. Spyromitros and G. Tsoumakas and I. Vlahavas Department of Informatics, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
More informationFeature Selection from the Perspective of Knowledge Granulation in Dynamic Set-valued Information System *
JORNAL OF INFORMATION SCIENCE AND ENGINEERING 32, 783-798 (2016) Feature Selection from the Perspective of Knowledge Granulation in Dynamic Set-valued Information System * WENBIN QIAN 1, WENHAO SH 2 AND
More informationDESIGN AND EVALUATION OF MACHINE LEARNING MODELS WITH STATISTICAL FEATURES
EXPERIMENTAL WORK PART I CHAPTER 6 DESIGN AND EVALUATION OF MACHINE LEARNING MODELS WITH STATISTICAL FEATURES The evaluation of models built using statistical in conjunction with various feature subset
More information3. Data Preprocessing. 3.1 Introduction
3. Data Preprocessing Contents of this Chapter 3.1 Introduction 3.2 Data cleaning 3.3 Data integration 3.4 Data transformation 3.5 Data reduction SFU, CMPT 740, 03-3, Martin Ester 84 3.1 Introduction Motivation
More informationAttribute Reduction using Forward Selection and Relative Reduct Algorithm
Attribute Reduction using Forward Selection and Relative Reduct Algorithm P.Kalyani Associate Professor in Computer Science, SNR Sons College, Coimbatore, India. ABSTRACT Attribute reduction of an information
More informationNominal Data. May not have a numerical representation Distance measures might not make sense. PR and ANN
NonMetric Data Nominal Data So far we consider patterns to be represented by feature vectors of real or integer values Easy to come up with a distance (similarity) measure by using a variety of mathematical
More informationA Rough Set Approach for Generation and Validation of Rules for Missing Attribute Values of a Data Set
A Rough Set Approach for Generation and Validation of Rules for Missing Attribute Values of a Data Set Renu Vashist School of Computer Science and Engineering Shri Mata Vaishno Devi University, Katra,
More informationMining High Order Decision Rules
Mining High Order Decision Rules Y.Y. Yao Department of Computer Science, University of Regina Regina, Saskatchewan, Canada S4S 0A2 e-mail: yyao@cs.uregina.ca Abstract. We introduce the notion of high
More informationFuzzy-Rough Feature Significance for Fuzzy Decision Trees
Fuzzy-Rough Feature Significance for Fuzzy Decision Trees Richard Jensen and Qiang Shen Department of Computer Science, The University of Wales, Aberystwyth {rkj,qqs}@aber.ac.uk Abstract Crisp decision
More informationCse634 DATA MINING TEST REVIEW. Professor Anita Wasilewska Computer Science Department Stony Brook University
Cse634 DATA MINING TEST REVIEW Professor Anita Wasilewska Computer Science Department Stony Brook University Preprocessing stage Preprocessing: includes all the operations that have to be performed before
More informationCS Machine Learning
CS 60050 Machine Learning Decision Tree Classifier Slides taken from course materials of Tan, Steinbach, Kumar 10 10 Illustrating Classification Task Tid Attrib1 Attrib2 Attrib3 Class 1 Yes Large 125K
More informationRank Measures for Ordering
Rank Measures for Ordering Jin Huang and Charles X. Ling Department of Computer Science The University of Western Ontario London, Ontario, Canada N6A 5B7 email: fjhuang33, clingg@csd.uwo.ca Abstract. Many
More informationDecision Tree CE-717 : Machine Learning Sharif University of Technology
Decision Tree CE-717 : Machine Learning Sharif University of Technology M. Soleymani Fall 2012 Some slides have been adapted from: Prof. Tom Mitchell Decision tree Approximating functions of usually discrete
More informationChapter S:II. II. Search Space Representation
Chapter S:II II. Search Space Representation Systematic Search Encoding of Problems State-Space Representation Problem-Reduction Representation Choosing a Representation S:II-1 Search Space Representation
More information7. Decision or classification trees
7. Decision or classification trees Next we are going to consider a rather different approach from those presented so far to machine learning that use one of the most common and important data structure,
More informationCOMP 465: Data Mining Classification Basics
Supervised vs. Unsupervised Learning COMP 465: Data Mining Classification Basics Slides Adapted From : Jiawei Han, Micheline Kamber & Jian Pei Data Mining: Concepts and Techniques, 3 rd ed. Supervised
More information2. Data Preprocessing
2. Data Preprocessing Contents of this Chapter 2.1 Introduction 2.2 Data cleaning 2.3 Data integration 2.4 Data transformation 2.5 Data reduction Reference: [Han and Kamber 2006, Chapter 2] SFU, CMPT 459
More informationA Decision-Theoretic Rough Set Model
A Decision-Theoretic Rough Set Model Yiyu Yao and Jingtao Yao Department of Computer Science University of Regina Regina, Saskatchewan, Canada S4S 0A2 {yyao,jtyao}@cs.uregina.ca Special Thanks to Professor
More informationUniversity of Florida CISE department Gator Engineering. Data Preprocessing. Dr. Sanjay Ranka
Data Preprocessing Dr. Sanjay Ranka Professor Computer and Information Science and Engineering University of Florida, Gainesville ranka@cise.ufl.edu Data Preprocessing What preprocessing step can or should
More informationh=[3,2,5,7], pos=[2,1], neg=[4,4]
2D1431 Machine Learning Lab 1: Concept Learning & Decision Trees Frank Hoffmann e-mail: hoffmann@nada.kth.se November 8, 2002 1 Introduction You have to prepare the solutions to the lab assignments prior
More informationA study on lower interval probability function based decision theoretic rough set models
Annals of Fuzzy Mathematics and Informatics Volume 12, No. 3, (September 2016), pp. 373 386 ISSN: 2093 9310 (print version) ISSN: 2287 6235 (electronic version) http://www.afmi.or.kr @FMI c Kyung Moon
More informationSupervised vs unsupervised clustering
Classification Supervised vs unsupervised clustering Cluster analysis: Classes are not known a- priori. Classification: Classes are defined a-priori Sometimes called supervised clustering Extract useful
More informationData Preprocessing. Data Preprocessing
Data Preprocessing Dr. Sanjay Ranka Professor Computer and Information Science and Engineering University of Florida, Gainesville ranka@cise.ufl.edu Data Preprocessing What preprocessing step can or should
More informationAmerican International Journal of Research in Science, Technology, Engineering & Mathematics
American International Journal of Research in Science, Technology, Engineering & Mathematics Available online at http://www.iasir.net ISSN (Print): 2328-3491, ISSN (Online): 2328-3580, ISSN (CD-ROM): 2328-3629
More informationAn Incremental Algorithm to Feature Selection in Decision Systems with the Variation of Feature Set
Chinese Journal of Electronics Vol.24, No.1, Jan. 2015 An Incremental Algorithm to Feature Selection in Decision Systems with the Variation of Feature Set QIAN Wenbin 1,2, SHU Wenhao 3, YANG Bingru 2 and
More informationRough Sets, Neighborhood Systems, and Granular Computing
Rough Sets, Neighborhood Systems, and Granular Computing Y.Y. Yao Department of Computer Science University of Regina Regina, Saskatchewan, Canada S4S 0A2 E-mail: yyao@cs.uregina.ca Abstract Granulation
More informationA Divide-and-Conquer Discretization Algorithm
A Divide-and-Conquer Discretization Algorithm Fan Min, Lijun Xie, Qihe Liu, and Hongbin Cai College of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu
More informationForward Feature Selection Using Residual Mutual Information
Forward Feature Selection Using Residual Mutual Information Erik Schaffernicht, Christoph Möller, Klaus Debes and Horst-Michael Gross Ilmenau University of Technology - Neuroinformatics and Cognitive Robotics
More informationDecision tree learning
Decision tree learning Andrea Passerini passerini@disi.unitn.it Machine Learning Learning the concept Go to lesson OUTLOOK Rain Overcast Sunny TRANSPORTATION LESSON NO Uncovered Covered Theoretical Practical
More informationClassification/Regression Trees and Random Forests
Classification/Regression Trees and Random Forests Fabio G. Cozman - fgcozman@usp.br November 6, 2018 Classification tree Consider binary class variable Y and features X 1,..., X n. Decide Ŷ after a series
More informationClassification. Instructor: Wei Ding
Classification Decision Tree Instructor: Wei Ding Tan,Steinbach, Kumar Introduction to Data Mining 4/18/2004 1 Preliminaries Each data record is characterized by a tuple (x, y), where x is the attribute
More informationROUGH SETS THEORY AND UNCERTAINTY INTO INFORMATION SYSTEM
ROUGH SETS THEORY AND UNCERTAINTY INTO INFORMATION SYSTEM Pavel Jirava Institute of System Engineering and Informatics Faculty of Economics and Administration, University of Pardubice Abstract: This article
More informationECE 285 Class Project Report
ECE 285 Class Project Report Based on Source localization in an ocean waveguide using supervised machine learning Yiwen Gong ( yig122@eng.ucsd.edu), Yu Chai( yuc385@eng.ucsd.edu ), Yifeng Bu( ybu@eng.ucsd.edu
More informationDecision trees. Decision trees are useful to a large degree because of their simplicity and interpretability
Decision trees A decision tree is a method for classification/regression that aims to ask a few relatively simple questions about an input and then predicts the associated output Decision trees are useful
More informationAN EFFICIENT BINARIZATION TECHNIQUE FOR FINGERPRINT IMAGES S. B. SRIDEVI M.Tech., Department of ECE
AN EFFICIENT BINARIZATION TECHNIQUE FOR FINGERPRINT IMAGES S. B. SRIDEVI M.Tech., Department of ECE sbsridevi89@gmail.com 287 ABSTRACT Fingerprint identification is the most prominent method of biometric
More informationCS229 Lecture notes. Raphael John Lamarre Townshend
CS229 Lecture notes Raphael John Lamarre Townshend Decision Trees We now turn our attention to decision trees, a simple yet flexible class of algorithms. We will first consider the non-linear, region-based
More informationA Modular k-nearest Neighbor Classification Method for Massively Parallel Text Categorization
A Modular k-nearest Neighbor Classification Method for Massively Parallel Text Categorization Hai Zhao and Bao-Liang Lu Department of Computer Science and Engineering, Shanghai Jiao Tong University, 1954
More informationData Mining & Feature Selection
دااگشنه رتبيت م عل م Data Mining & Feature Selection M.M. Pedram pedram@tmu.ac.ir Faculty of Engineering, Tarbiat Moallem University The 11 th Iranian Confernce on Fuzzy systems, 5-7 July, 2011 Contents
More informationMIT 801. Machine Learning I. [Presented by Anna Bosman] 16 February 2018
MIT 801 [Presented by Anna Bosman] 16 February 2018 Machine Learning What is machine learning? Artificial Intelligence? Yes as we know it. What is intelligence? The ability to acquire and apply knowledge
More informationNominal Data. May not have a numerical representation Distance measures might not make sense PR, ANN, & ML
Decision Trees Nominal Data So far we consider patterns to be represented by feature vectors of real or integer values Easy to come up with a distance (similarity) measure by using a variety of mathematical
More informationExtra readings beyond the lecture slides are important:
1 Notes To preview next lecture: Check the lecture notes, if slides are not available: http://web.cse.ohio-state.edu/~sun.397/courses/au2017/cse5243-new.html Check UIUC course on the same topic. All their
More informationHandling Missing Attribute Values in Preterm Birth Data Sets
Handling Missing Attribute Values in Preterm Birth Data Sets Jerzy W. Grzymala-Busse 1, Linda K. Goodwin 2, Witold J. Grzymala-Busse 3, and Xinqun Zheng 4 1 Department of Electrical Engineering and Computer
More informationSome questions of consensus building using co-association
Some questions of consensus building using co-association VITALIY TAYANOV Polish-Japanese High School of Computer Technics Aleja Legionow, 4190, Bytom POLAND vtayanov@yahoo.com Abstract: In this paper
More informationA Closest Fit Approach to Missing Attribute Values in Preterm Birth Data
A Closest Fit Approach to Missing Attribute Values in Preterm Birth Data Jerzy W. Grzymala-Busse 1, Witold J. Grzymala-Busse 2, and Linda K. Goodwin 3 1 Department of Electrical Engineering and Computer
More informationPerformance Degradation Assessment and Fault Diagnosis of Bearing Based on EMD and PCA-SOM
Performance Degradation Assessment and Fault Diagnosis of Bearing Based on EMD and PCA-SOM Lu Chen and Yuan Hang PERFORMANCE DEGRADATION ASSESSMENT AND FAULT DIAGNOSIS OF BEARING BASED ON EMD AND PCA-SOM.
More informationA Systematic Overview of Data Mining Algorithms. Sargur Srihari University at Buffalo The State University of New York
A Systematic Overview of Data Mining Algorithms Sargur Srihari University at Buffalo The State University of New York 1 Topics Data Mining Algorithm Definition Example of CART Classification Iris, Wine
More informationWrapper Feature Selection using Discrete Cuckoo Optimization Algorithm Abstract S.J. Mousavirad and H. Ebrahimpour-Komleh* 1 Department of Computer and Electrical Engineering, University of Kashan, Kashan,
More informationEnhancing Forecasting Performance of Naïve-Bayes Classifiers with Discretization Techniques
24 Enhancing Forecasting Performance of Naïve-Bayes Classifiers with Discretization Techniques Enhancing Forecasting Performance of Naïve-Bayes Classifiers with Discretization Techniques Ruxandra PETRE
More informationData Mining. Part 2. Data Understanding and Preparation. 2.4 Data Transformation. Spring Instructor: Dr. Masoud Yaghini. Data Transformation
Data Mining Part 2. Data Understanding and Preparation 2.4 Spring 2010 Instructor: Dr. Masoud Yaghini Outline Introduction Normalization Attribute Construction Aggregation Attribute Subset Selection Discretization
More informationClassification: Basic Concepts, Decision Trees, and Model Evaluation
Classification: Basic Concepts, Decision Trees, and Model Evaluation Data Warehousing and Mining Lecture 4 by Hossen Asiful Mustafa Classification: Definition Given a collection of records (training set
More informationFEATURE SELECTION TECHNIQUES
CHAPTER-2 FEATURE SELECTION TECHNIQUES 2.1. INTRODUCTION Dimensionality reduction through the choice of an appropriate feature subset selection, results in multiple uses including performance upgrading,
More informationRobustness of Selective Desensitization Perceptron Against Irrelevant and Partially Relevant Features in Pattern Classification
Robustness of Selective Desensitization Perceptron Against Irrelevant and Partially Relevant Features in Pattern Classification Tomohiro Tanno, Kazumasa Horie, Jun Izawa, and Masahiko Morita University
More informationA Systematic Overview of Data Mining Algorithms
A Systematic Overview of Data Mining Algorithms 1 Data Mining Algorithm A well-defined procedure that takes data as input and produces output as models or patterns well-defined: precisely encoded as a
More informationA Hybrid Feature Selection Algorithm Based on Information Gain and Sequential Forward Floating Search
A Hybrid Feature Selection Algorithm Based on Information Gain and Sequential Forward Floating Search Jianli Ding, Liyang Fu School of Computer Science and Technology Civil Aviation University of China
More informationLecture 2 :: Decision Trees Learning
Lecture 2 :: Decision Trees Learning 1 / 62 Designing a learning system What to learn? Learning setting. Learning mechanism. Evaluation. 2 / 62 Prediction task Figure 1: Prediction task :: Supervised learning
More informationUnivariate and Multivariate Decision Trees
Univariate and Multivariate Decision Trees Olcay Taner Yıldız and Ethem Alpaydın Department of Computer Engineering Boğaziçi University İstanbul 80815 Turkey Abstract. Univariate decision trees at each
More informationCredit card Fraud Detection using Predictive Modeling: a Review
February 207 IJIRT Volume 3 Issue 9 ISSN: 2396002 Credit card Fraud Detection using Predictive Modeling: a Review Varre.Perantalu, K. BhargavKiran 2 PG Scholar, CSE, Vishnu Institute of Technology, Bhimavaram,
More informationMachine Learning Techniques for Data Mining
Machine Learning Techniques for Data Mining Eibe Frank University of Waikato New Zealand 10/25/2000 1 PART VII Moving on: Engineering the input and output 10/25/2000 2 Applying a learner is not all Already
More informationClassification: Decision Trees
Classification: Decision Trees IST557 Data Mining: Techniques and Applications Jessie Li, Penn State University 1 Decision Tree Example Will a pa)ent have high-risk based on the ini)al 24-hour observa)on?
More informationCOMP61011 Foundations of Machine Learning. Feature Selection
OMP61011 Foundations of Machine Learning Feature Selection Pattern Recognition: The Early Days Only 200 papers in the world! I wish! Pattern Recognition: The Early Days Using eight very simple measurements
More informationInformation Granulation and Approximation in a Decision-theoretic Model of Rough Sets
Information Granulation and Approximation in a Decision-theoretic Model of Rough Sets Y.Y. Yao Department of Computer Science University of Regina Regina, Saskatchewan Canada S4S 0A2 E-mail: yyao@cs.uregina.ca
More informationData Cleaning and Prototyping Using K-Means to Enhance Classification Accuracy
Data Cleaning and Prototyping Using K-Means to Enhance Classification Accuracy Lutfi Fanani 1 and Nurizal Dwi Priandani 2 1 Department of Computer Science, Brawijaya University, Malang, Indonesia. 2 Department
More informationComment Extraction from Blog Posts and Its Applications to Opinion Mining
Comment Extraction from Blog Posts and Its Applications to Opinion Mining Huan-An Kao, Hsin-Hsi Chen Department of Computer Science and Information Engineering National Taiwan University, Taipei, Taiwan
More informationBig Data Methods. Chapter 5: Machine learning. Big Data Methods, Chapter 5, Slide 1
Big Data Methods Chapter 5: Machine learning Big Data Methods, Chapter 5, Slide 1 5.1 Introduction to machine learning What is machine learning? Concerned with the study and development of algorithms that
More informationVariable Selection 6.783, Biomedical Decision Support
6.783, Biomedical Decision Support (lrosasco@mit.edu) Department of Brain and Cognitive Science- MIT November 2, 2009 About this class Why selecting variables Approaches to variable selection Sparsity-based
More informationGraph Matching: Fast Candidate Elimination Using Machine Learning Techniques
Graph Matching: Fast Candidate Elimination Using Machine Learning Techniques M. Lazarescu 1,2, H. Bunke 1, and S. Venkatesh 2 1 Computer Science Department, University of Bern, Switzerland 2 School of
More informationWeka ( )
Weka ( http://www.cs.waikato.ac.nz/ml/weka/ ) The phases in which classifier s design can be divided are reflected in WEKA s Explorer structure: Data pre-processing (filtering) and representation Supervised
More informationPart I. Instructor: Wei Ding
Classification Part I Instructor: Wei Ding Tan,Steinbach, Kumar Introduction to Data Mining 4/18/2004 1 Classification: Definition Given a collection of records (training set ) Each record contains a set
More informationAn Effective Performance of Feature Selection with Classification of Data Mining Using SVM Algorithm
Proceedings of the National Conference on Recent Trends in Mathematical Computing NCRTMC 13 427 An Effective Performance of Feature Selection with Classification of Data Mining Using SVM Algorithm A.Veeraswamy
More information