Towards scaling up induction of second-order decision tables
R. Hewett and J. Leuchner
Institute for Human and Machine Cognition, University of West Florida, USA

Data Mining III

Abstract

One of the fundamental challenges for data mining is to enable inductive learning algorithms to operate on very large databases. Ensemble learning techniques such as bagging have been applied successfully to improve the accuracy of classification models by generating multiple models from replicate training sets and aggregating them to form a composite model. In this paper, we adapt the bagging approach for scaling up and also study the effects of data partitioning, sampling, and aggregation techniques for mining very large databases. Our recent work developed SORCER, a learning system that induces a near-minimal rule set from a data set represented as a second-order decision table (a database relation in which rows have sets of atomic values as components). Despite its simplicity, experiments show that SORCER is competitive with other, state-of-the-art induction systems. Here we apply SORCER using two instance subset selection procedures (random partitioning and sampling with replacement) and two aggregation procedures (majority voting and selecting the model that performs best on a validation set). We experiment with the GIS data set, from the UCI KDD Repository, which contains 581,012 instances of 30x30 meter cells with 54 attributes for classifying forest cover types. Performance results are reported, including results from mining the entire training data set using different compression algorithms in SORCER and published results from neural net and decision tree learners.

1 Introduction

The development of inductive learning algorithms that scale up to very large data sets is a fundamental problem in data mining applications. Scalability raises
the issue of whether an algorithm can be efficient while building the best possible model from a very large data set. To machine learning researchers, very large usually means a data set containing at least 100,000 examples and 25 problem variables. For the KDD (knowledge discovery and data mining) community, data sizes of 100 megabytes (or about one million examples) are considered very large [8]. Although very large data sets can be dealt with by sampling, larger training sets often produce more accurate models, especially with noisy data or data sets with many special cases [11]. Efficiency and accuracy are commonly used for evaluating the effectiveness of scaling-up techniques, particularly for classification algorithms. However, data mining also recognizes the importance of the ease with which the resulting models can be interpreted. In fact, it is not uncommon to run a state-of-the-art algorithm over a large data set for several hours and then discard much of the output in order to obtain less accurate but more comprehensible results [10]. Our data mining research into the use of comprehensible models for abstraction of regularities from data has produced SORCER (Second-Order Relation Compression for Extraction of Rules) [7], a learning system that induces classification rules from a data set represented as a second-order decision table. Based on the theoretical framework presented in [9], second-order decision tables are database relations in which tuples (rows) have sets of atomic values as components (entries). Using sets of values, interpreted as disjunctions, provides compact representations that facilitate efficient management and enhance comprehensibility. SORCER's induction algorithm can be viewed as decision table compression in which a table representing training data is transformed into a shorter table of more general rules by merging rows in ways that preserve consistency with the original data.
SORCER attempts to generate classifiers with a minimum number of rows. This bias toward fewer rows further facilitates comprehensibility. Despite its simplicity, experiments show that SORCER is competitive with popular state-of-the-art systems [7]. Ensemble learning such as bagging [3] has been applied successfully to improve the accuracy of classification by generating multiple models from replicate training sets and aggregating them to form a composite model. However, the resulting composite models can be quite large and complex. In this paper, we adapt the bagging approach for scaling up SORCER and study the effects of data partitioning, sampling, and aggregation techniques for mining very large databases. We choose bagging over boosting (another ensemble learning method) because bagging can be implemented to process concurrently, thus increasing efficiency. We apply SORCER using two instance subset selection procedures (random partitioning and sampling with replacement) and two aggregation procedures (majority voting and selecting the model that performs best on a validation set). Unlike other bagging-like approaches, here a composite model represented by second-order decision tables in SORCER can be compressed to a single shorter table. Reduction of the size of the model is one way to improve comprehensibility. We describe our experimentation with GIS (Geographic Information System) data, obtained from the UCI KDD Repository [1], which contains 581,012 instances of 30x30 meter cells with 54
attributes for classifying forest cover types. Our experiments use a version of SORCER whose code has not been optimized for efficiency. Our objective is to investigate data partitioning techniques for scaling up second-order decision table induction. This paper reports on that preliminary effort. For completeness, we give a brief overview of SORCER in Section 2. Section 3 describes our methodology. Experiments and results are given in Section 4. Section 5 discusses related work and conclusions.

2 Second-order decision table induction system, SORCER

2.1 Definitions and terminology

We use the terms table (relation) and row (tuple or rule) to refer to the second-order structures which we now define. Rows are mappings defined on a set of attributes (problem variables) such that the image of an attribute A, denoted r(A), is a subset of A's domain (the values which it may assume). A table is a set of rows. The scheme of a row or table is the set of attributes on which it is defined. The partial ordering covers on the set of all rows (over a fixed scheme) is component-wise set inclusion, i.e., row s is covered by r if s(A) ⊆ r(A) for each attribute A. The meet and join of a pair of rules are their component-wise intersection and union, respectively. Flat rows are those whose components are either singletons or empty. (Empty components represent missing information, unknown values.) The flat extension of table R is the table consisting of all flat rows covered by at least one row in R. A table S is said to subsume relation R if the flat extension of R is a subset of the flat extension of S. Two relations are equivalent if each subsumes the other. A transformation that transforms table R into table S is equivalence-preserving if R is equivalent to S. A decision table represents a function assigning classifications to conditions and has a scheme consisting of condition attributes and a classification attribute.
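These definitions can be rendered as a small sketch (hypothetical Python with made-up attribute names, not part of SORCER): a row maps each attribute to a set of values, covers is component-wise inclusion, and meet and join are component-wise intersection and union.

```python
# A row maps each attribute to a set of values; the empty set encodes
# a missing (unknown) component. The scheme below is hypothetical.
SCHEME = ["Outlook", "Temp", "Class"]

def covers(r, s):
    """Row s is covered by row r iff s(A) is a subset of r(A) for every attribute A."""
    return all(s[a] <= r[a] for a in SCHEME)

def meet(r, s):
    """Component-wise intersection of two rows."""
    return {a: r[a] & s[a] for a in SCHEME}

def join(r, s):
    """Component-wise union of two rows."""
    return {a: r[a] | s[a] for a in SCHEME}

r = {"Outlook": {"sunny", "overcast"}, "Temp": {"hot"}, "Class": {0}}
s = {"Outlook": {"sunny"}, "Temp": {"hot"}, "Class": {0}}
print(covers(r, s))  # True: every component of s is included in r
```

A flat row is one in which every component above is a singleton or empty; the flat extension enumerates all flat rows covered by some row of the table.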
The classification of a condition c (a row whose classification entry is empty) by decision table T, denoted T(c), is the union of the classifications of all rows of T that cover the condition. A simple condition is a condition with singleton values for all condition attributes. A decision table is consistent if it associates at most one classification to any simple condition. A decision table is complete if it classifies (gives a nonempty value to) all simple conditions. The transformation of a table R into table S is consistency-preserving if (1) every simple condition classified by R is given the same classification(s) by S, and (2) for any simple condition c not classified by R, |S(c)| ≤ 1.

2.2 Basic algorithm

The basic induction algorithm in Figure 1 starts with a flat table of training data and, by repeated transformation, produces a general table (covering more conditions) subsuming the original. At each step, the table is an approximation to the unknown target function. The transformations correspond to a search, through a hypothesis space of second-order tables, for a suitable approximating
function.

Input: a decision table T.
Output: a decision table R such that R is consistent with T and the size of R is minimal or near minimal within cost constraints.

(1) Apply equivalence-preserving transformations, guided by heuristics, subject to cost constraints.
(2) Infer additional rules or additional attribute values for components of individual rules.
(3) Repeat Steps (1) and (2) until neither changes the relation.
(4) Apply consistency-preserving transformations, guided by heuristics, subject to cost constraints.
(5) Go to Step (1). Stop when no further transformation has occurred within the cost constraints.

Figure 1: Basic induction.

Equivalence-preserving transformations may include delete redundant rules (remove rows subsumed by other rows of the table) and merge joinable (replace a pair of rows agreeing on all attributes except one by their join). An example of a consistency-preserving transformation is merge consistent: merge a pair of rules whose join does not introduce inconsistency. Such a pair is said to be consistently joinable, and their merge may add new conditions, generalizing the table, without creating inconsistency. SORCER provides another type of transformation, inclusion of statistically determined rules. An example is add high probability rows (p), where p specifies a minimum accuracy. Currently, SORCER only considers rules with one condition attribute. For example, if p = 0.90, the rule (A = a) => (Class = 0) is added to the table if (Class = 0) for at least 90% of the training data set examples in which (A = a). Statistically determined rules may fail to preserve consistency, and, currently, SORCER only applies them to flat tables. Conceptually, equivalence-preserving transformations can be used for data compacting and to identify meaningful clusters of values, both of which aid comprehensibility. Consistency-preserving transformations can generalize a table to cover more conditions, which may also simplify classification rules.
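As a rough illustration (hypothetical, deliberately simplified code, not SORCER's implementation): merge joinable applies to rows agreeing on all attributes except one; merge consistent joins two rows with the same classification provided the join does not cover a training condition carrying a different classification; and add high probability rows adds a single-attribute rule (A = a) => (Class = c) when class c holds for at least a fraction p of the examples with A = a. Attribute names X1, X2 are assumptions for the sketch.

```python
from collections import Counter, defaultdict

CONDS = ["X1", "X2"]  # hypothetical condition attributes

def join(r, s):
    """Component-wise union of two rows (condition attributes plus Class)."""
    return {a: r[a] | s[a] for a in CONDS + ["Class"]}

def joinable(r, s):
    """Merge joinable applies when two rows agree on all attributes except one."""
    return sum(r[a] != s[a] for a in CONDS + ["Class"]) <= 1

def covers_cond(row, cond):
    """Row covers a condition iff the condition's entries are included component-wise."""
    return all(cond[a] <= row[a] for a in CONDS)

def consistent_join(r, s, flat_training):
    """Merge consistent: join r and s only if the join does not cover a
    training condition whose classification differs from the join's."""
    if r["Class"] != s["Class"]:
        return None
    j = join(r, s)
    for ex in flat_training:
        if covers_cond(j, ex) and ex["Class"] != j["Class"]:
            return None
    return j

def high_probability_rules(flat_training, p=0.9):
    """Add (A = a) => (Class = c) when class c holds for at least a
    fraction p of the training examples in which A = a."""
    counts = defaultdict(Counter)
    for ex in flat_training:
        for a in CONDS:
            counts[(a, next(iter(ex[a])))][next(iter(ex["Class"]))] += 1
    rules = []
    for (a, v), ctr in counts.items():
        cls, n = ctr.most_common(1)[0]
        if n / sum(ctr.values()) >= p:
            rule = {x: ({v} if x == a else set()) for x in CONDS}
            rule["Class"] = {cls}
            rules.append(rule)
    return rules

T = [{"X1": {"a"}, "X2": {"p"}, "Class": {0}},
     {"X1": {"b"}, "X2": {"p"}, "Class": {0}}]
merged = consistent_join(T[0], T[1], T)
print(merged["X1"] == {"a", "b"})  # True: the pair was consistently joinable
```

The real transformations are applied repeatedly to a fixed point, under the cost constraints of Figure 1; this sketch shows only a single merge.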
Inclusion of statistically determined rules allows creation of simple rules (i.e., based on fewer condition attributes) with a specified level of accuracy. The time complexity of SORCER's induction depends on the transformations applied. For example, merge joinable is O(kn²) and merging consistently joinable pairs to a fixed point is O(kn³), where n is the table length and k is the sum of the attributes' domain sizes. Details are in [7]. Since many decision problems involving second-order tables (e.g., determining whether a table covers a row) are NP-hard [7], resource constraints (e.g., number of iterations) may be applied for operations likely to be prohibitively expensive. Heuristics based on domain knowledge, such as ranking attributes by discriminatory power, could help select appropriate operations or rows. The rule set produced by the algorithm may not be complete. For conditions not covered by the model, a rule is selected heuristically to provide a classification.
The heuristics include a preference for rules that (1) cover the query on more attributes, (2) cover fewer conditions, and (3) give the most common classification appearing in the table. More details of SORCER are in [7].

3 Methodology

The issue in scaling up is often not speed, per se, but the size of the data set that can be handled. Scaling up learning algorithms involves finding techniques to make impractical algorithms practical. Many approaches have been proposed for scaling up inductive algorithms, including designing fast algorithms and data partitioning. The first approach either develops efficient algorithms or increases the efficiency of existing algorithms. The data partitioning approach uses a divide-and-conquer strategy to deal with huge data sets, applying the algorithm to one or more subsets of the data and possibly combining results. Consequently, an algorithm with time complexity worse than linear in the number of examples may be made linear, with the constant term dependent on the size of the subsets [5]. A survey paper by Provost and Kolluri [11] provides a comprehensive description of a variety of scaling-up techniques. For this paper, we employ the data partitioning approach. We next describe our scaling-up approach and the compression algorithms applied in our experiments.

3.1 Scaling up methods

Techniques for data partitioning can be categorized by several dimensions based on how data subsets are (1) separated (e.g., by instances or features), (2) selected (e.g., sampling, partitioning), (3) trained and processed (e.g., concurrently; sequentially, as in incremental batch learning; model-guided instance selection), and (4) how the resulting models are produced (e.g., combine predictions) [11].

Figure 2: A conceptual view of a data partitioning approach.

Figure 2 gives a general model of a data partitioning approach. A selection procedure selects one or more subsets Ti (i = 1, ..., k) of a large training data set. Each subset Ti is used as a training set for a learning algorithm A to produce a classification model Ci. An aggregation procedure then uses results from the classifiers Ci to produce a final classifier, C. One advantage of this model is that it provides independent multi-subset learning. Thus, each learning process can be run concurrently. Specific methods for scaling up vary by at least three factors: the learning algorithm, the selection procedure, and the aggregation procedure, as shown in the oval shapes of Figure 2. In general, different learning algorithms can be used to build each classifier. For this paper, we focus on SORCER's basic induction algorithm, which varies depending on the transformations applied (as discussed in Section 3.2). We use two instance subset selection procedures (random partitioning and sampling with replacement) and two aggregation procedures (majority voting and selecting the best model on a validation set). Majority voting combines a set of classifiers by taking the union of rules in the classifiers and resolving inconsistencies by eliminating rules with less frequent classifications. The term combine, when applied to classifiers, refers to the majority voting aggregation procedure. We describe the four specific combinations of these procedures used in our experiments below.

Method 1: Randomly partition the training set into subsets, obtain a classifier from each subset, and select the classifier that performs best (highest accuracy) on a validation set.
Method 2: Randomly partition the training set into subsets, obtain a classifier from each subset, and combine all classifiers into a final classifier.

Method 3: Randomly sample with replacement, and greedily cover the training set by incremental combination of a classifier obtained from the current sample (Ci) with the current best combined classifier obtained from previous samples (C*), as long as the new combined classifier of Ci and C* (Ccomb) performs better than C* on a validation set. Each time a new classifier is combined into C*, update C* to Ccomb. The final classifier is C*.

Method 4: Same as Method 3 except that m classifiers (each trained from different samples) are considered for combining with the current best combined classifier at a time, instead of one at a time as in Method 3.

Method 2 is most similar to a bagging approach, except that in bagging, subsets are randomly sampled with replacement from a training data set. As shown in Figure 2, each of these methods applies the same learning algorithm to each training subset. We conduct four experiments with three compression algorithms. Each of the first three experiments applies one compression algorithm. In the final experiment, for each training subset, all three compression algorithms are applied to produce three classifiers, and the classifier with the
highest accuracy on a validation set is selected. We refer to these experiments as experiments with algorithms A1, A2, A3, and Best, respectively. The next section describes the compression algorithms A1-A3.

Alg. | Transformations and Operations
A1   | Merge consistent
A2   | Merge joinable; Merge consistent
A3   | Add high probability rows (p); Merge consistent

Figure 3: Three compression algorithms.

3.2 Compression algorithms

The transformations described in Section 2.2 can be applied in various combinations to create different compression algorithms. The algorithms used in our experiments are summarized in Figure 3. These algorithms are representative of induction using basic transformations, which can be specified easily in SORCER by generating script files of SORCER commands. A1 merges pairs of consistently joinable rows until no more consistent joining is possible. It is the simplest compression with generalization. (Applying only an equivalence-preserving transformation, such as merge joinable, gives a classifier that simply remembers all seen cases.) A2 first merges locally joinable pairs, until no more such joins are possible, and then merges pairs of consistently joinable rows until no more consistent joining is possible. By applying merge joinable before merge consistent, A2 attempts to give priority to generalization according to the structure of the knowledge partially formed by equivalence-preserving transformation of the training data set. A3 adds statistically determined rules whose accuracy exceeds a specified threshold before applying the transformations used in A1. For the experiments in this paper, we used the threshold p = 0.9. Since the order of the training data may affect the result of compression, we had SORCER shuffle the training data before applying each compression algorithm.

4 Experiments and results

The GIS data, obtained from the UCI KDD Repository [1], contains 581,012 instances of 30x30 meter cells with 54 attributes for classifying forest cover types.
There are 44 binary attributes, ten attributes with continuous values, and seven classes of forest cover types. The class frequencies vary from classes with occurrences of 48.7% and 36.5% to 0.5% of the data. We randomly selected 181,012 data instances for testing, 395,000 for training, and 5,000 for validation. A random sample of size 15,800 from the training set is used by
SORCER to discretize continuous attributes, and the boundaries obtained are used for discretizing the rest of the data. To obtain a consistent classifier, SORCER resolves inconsistencies by retaining instances that occur more frequently. We ran experiments on a Pentium III, 500 MHz PC with 256 Mb of memory. We first observe a learning curve for the data set by running SORCER on samples with sizes varying from 100 instances to the entire training data set. Accuracies are for the testing data set. Figure 4 shows the training time of each sample using algorithms A1-A3 and the average learning curve (over three runs of each algorithm). Most of the training time is used to resolve inconsistency, as shown in Figure 4.

Figure 4: Training time and accuracy obtained from different training set sizes.

A training set of size 15.8 K gives an average accuracy of 70.2%, and from that size on, accuracy no longer improves. The right of Figure 4 shows that the three algorithms produce, from the entire training data set, classifiers of similar accuracy with slightly different training times. Based on this result, we decided to use partitioned subsets and 25 random samples of size 15.8 K for each method described in Section 3. Figure 5 summarizes the results obtained from each method and algorithm. Loading data and classifier aggregation each took a few seconds, and since this time is essentially the same for all the algorithms, we exclude it from the training time in Col 2. Col 3 shows the time SORCER took to transform the final classifiers of the size (table length) shown in Col 1 to equivalent classifiers of the size shown in Col 5. Classifiers in Method 1 cannot be compressed further since they are not combined classifiers.
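For reference, the combine (majority voting) aggregation used by Methods 2-4 in Section 3.1 can be sketched as follows (an illustrative simplification, not SORCER's code: each classifier is a list of (condition, class) rules, and rules with identical condition parts that disagree on the class are resolved by keeping the most frequent classification).

```python
from collections import Counter

def combine(classifiers):
    """Majority voting: pool all rules; where identical condition parts
    disagree on the class, keep only the most frequent classification."""
    votes = {}  # condition part (hashable) -> Counter of proposed classes
    for rules in classifiers:
        for cond, cls in rules:
            votes.setdefault(cond, Counter())[cls] += 1
    # One rule per condition, labeled with its majority class.
    return [(cond, ctr.most_common(1)[0][0]) for cond, ctr in votes.items()]

# Three hypothetical classifiers over a single condition attribute X1.
c1 = [(("X1", "a"), 0), (("X1", "b"), 1)]
c2 = [(("X1", "a"), 0)]
c3 = [(("X1", "a"), 1), (("X1", "b"), 1)]
print(sorted(combine([c1, c2, c3])))  # [(('X1', 'a'), 0), (('X1', 'b'), 1)]
```

The combined rule list is itself a second-order decision table, which is why SORCER can subsequently compress it to an equivalent shorter table.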
We compare the total time (i.e., Cols 3 and 4) of each algorithm with the time spent on training with the entire training data set (as in Figure 4, except for Best, where we use the average training time of all algorithms) and show the percent reduction of the training time in Col 4. As shown in Col 6, the accuracies obtained are at worst 1% and at best 0.1% lower
than the accuracy obtained from the entire training set.

Figure 5: Results from data partitioning approaches.

There is no large difference between the accuracies obtained from different algorithms. However, Method 4 (with m = 5) seems to give slightly higher accuracy than the others, while Method 3 was fastest with only slightly lower accuracy. In general, there is no great loss in accuracy from using a data partitioning technique, but there is a large decrease in training time: 91.5% on average for the fastest method.

5 Related work and conclusions

Sampling is a common technique for scaling up classification algorithms to large data sets [4, 11]. However, the question of how large a training sample should be to achieve optimal accuracy (the highest achievable with the entire data set) is not obvious. Recent work on progressive sampling (PS) [12] provides an efficient search for a suitable sample size. Though we do not explicitly study the effect of PS, we applied its concept to observe SORCER's learning curve and select a sample size for our experiments. Several data partitioning techniques have been proposed for scaling up [11]. Work in ensemble learning has shown that combining the output of a set of classifiers that are independently trained from random data samples can greatly improve accuracy [3]. Like other ensemble learning methods, bagging has been studied in the context of accuracy improvement. Here we use the bagging concept in studying scaling up. Unlike ordinary bagging, we use SORCER to further compress a combined classifier into a smaller model for enhanced comprehensibility. Other experiments on the GIS data set have been published. Blackard [2] reported 70% accuracy obtained using neural net back propagation and 58% accuracy using linear discriminant analysis. Gu et al. [6] propose an efficient technique to find a good starting sample size for PS. By using a decision tree learner, C5.0, the improved version of C4.5 [13], accuracies of 73% and 75.8%
were obtained on an initial sample and on the entire training data of 400K instances, respectively. However, these results used different experimental settings and thus give only a rough idea of where SORCER's performance stands. Our experiments show that the tradeoff between increased accuracy, using larger training sets, and time efficiency, using smaller training sets, is an important consideration for scaling up learning algorithms. We view this work as a first step toward scaling-up techniques for SORCER. Many improvements are possible, including program optimization within SORCER and more efficient ways to deal with inconsistent data. We also plan to investigate feature subset selection for scaling up.

References

[1] Bay, S.D., The UCI KDD Archive (http://kdd.ics.uci.edu), Irvine, CA: University of California, Department of Information and Computer Science.
[2] Blackard, J.A., Comparison of Neural Networks and Discriminant Analysis in Predicting Forest Cover Types. Ph.D. dissertation, Department of Forest Sciences, Colorado State University, Fort Collins, Colorado.
[3] Breiman, L., Bagging predictors. Machine Learning, 24(2), pp. 123-140, 1996.
[4] Catlett, J., Megainduction: A test flight. Proceedings of the 8th International Workshop on Machine Learning, Morgan Kaufmann.
[5] Domingos, P., Efficient specific-to-general rule induction. Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining, Menlo Park, CA, AAAI Press.
[6] Gu, B., Liu, B., Hu, F. and Liu, H., Efficiently determine the starting sample size for progressive sampling. Proceedings of the 12th European Conference on Machine Learning, Freiburg, Germany.
[7] Hewett, R. and Leuchner, J., The power of second-order decision tables. Proceedings of the 2nd SIAM International Conference on Data Mining, 2002.
[8] Huber, P., From large to huge: a statistician's reaction to KDD and DM. Proceedings of the 3rd International Conference on Knowledge Discovery and Data Mining, Menlo Park, CA, AAAI Press.
[9] Leuchner, J. and R.
Hewett, A formal framework for large decision tables. Proceedings of the Conference on Knowledge Retrieval, Use and Storage for Efficiency.
[10] Oates, T. and Jensen, D., Large data sets lead to overly complex models: an explanation and a solution. Proceedings of the 4th International Conference on Knowledge Discovery and Data Mining.
[11] Provost, F. and Kolluri, V., A survey of methods for scaling up inductive algorithms. Data Mining and Knowledge Discovery, 2, pp. 1-42.
[12] Provost, F., Jensen, D. and Oates, T., Efficient progressive sampling. Proceedings of the 5th International Conference on Knowledge Discovery and Data Mining, AAAI/MIT Press.
[13] Quinlan, J., C4.5: Programs for Machine Learning, San Mateo, CA: Morgan Kaufmann, 1993.
More informationClosed Non-Derivable Itemsets
Closed Non-Derivable Itemsets Juho Muhonen and Hannu Toivonen Helsinki Institute for Information Technology Basic Research Unit Department of Computer Science University of Helsinki Finland Abstract. Itemset
More informationImproving the Efficiency of Fast Using Semantic Similarity Algorithm
International Journal of Scientific and Research Publications, Volume 4, Issue 1, January 2014 1 Improving the Efficiency of Fast Using Semantic Similarity Algorithm D.KARTHIKA 1, S. DIVAKAR 2 Final year
More informationFeature Selection Based on Relative Attribute Dependency: An Experimental Study
Feature Selection Based on Relative Attribute Dependency: An Experimental Study Jianchao Han, Ricardo Sanchez, Xiaohua Hu, T.Y. Lin Department of Computer Science, California State University Dominguez
More informationMinimal Test Cost Feature Selection with Positive Region Constraint
Minimal Test Cost Feature Selection with Positive Region Constraint Jiabin Liu 1,2,FanMin 2,, Shujiao Liao 2, and William Zhu 2 1 Department of Computer Science, Sichuan University for Nationalities, Kangding
More informationEnsemble Learning. Another approach is to leverage the algorithms we have via ensemble methods
Ensemble Learning Ensemble Learning So far we have seen learning algorithms that take a training set and output a classifier What if we want more accuracy than current algorithms afford? Develop new learning
More informationAn Information-Theoretic Approach to the Prepruning of Classification Rules
An Information-Theoretic Approach to the Prepruning of Classification Rules Max Bramer University of Portsmouth, Portsmouth, UK Abstract: Keywords: The automatic induction of classification rules from
More informationWeb Service Usage Mining: Mining For Executable Sequences
7th WSEAS International Conference on APPLIED COMPUTER SCIENCE, Venice, Italy, November 21-23, 2007 266 Web Service Usage Mining: Mining For Executable Sequences MOHSEN JAFARI ASBAGH, HASSAN ABOLHASSANI
More informationClassification with Diffuse or Incomplete Information
Classification with Diffuse or Incomplete Information AMAURY CABALLERO, KANG YEN Florida International University Abstract. In many different fields like finance, business, pattern recognition, communication
More informationStructure of Association Rule Classifiers: a Review
Structure of Association Rule Classifiers: a Review Koen Vanhoof Benoît Depaire Transportation Research Institute (IMOB), University Hasselt 3590 Diepenbeek, Belgium koen.vanhoof@uhasselt.be benoit.depaire@uhasselt.be
More informationKEYWORDS: Clustering, RFPCM Algorithm, Ranking Method, Query Redirection Method.
IJESRT INTERNATIONAL JOURNAL OF ENGINEERING SCIENCES & RESEARCH TECHNOLOGY IMPROVED ROUGH FUZZY POSSIBILISTIC C-MEANS (RFPCM) CLUSTERING ALGORITHM FOR MARKET DATA T.Buvana*, Dr.P.krishnakumari *Research
More informationValue Added Association Rules
Value Added Association Rules T.Y. Lin San Jose State University drlin@sjsu.edu Glossary Association Rule Mining A Association Rule Mining is an exploratory learning task to discover some hidden, dependency
More informationA Bagging Method using Decision Trees in the Role of Base Classifiers
A Bagging Method using Decision Trees in the Role of Base Classifiers Kristína Machová 1, František Barčák 2, Peter Bednár 3 1 Department of Cybernetics and Artificial Intelligence, Technical University,
More informationBoosting Algorithms for Parallel and Distributed Learning
Distributed and Parallel Databases, 11, 203 229, 2002 c 2002 Kluwer Academic Publishers. Manufactured in The Netherlands. Boosting Algorithms for Parallel and Distributed Learning ALEKSANDAR LAZAREVIC
More informationSandeep Kharidhi and WenSui Liu ChoicePoint Precision Marketing
Generalized Additive Model and Applications in Direct Marketing Sandeep Kharidhi and WenSui Liu ChoicePoint Precision Marketing Abstract Logistic regression 1 has been widely used in direct marketing applications
More informationSIMILARITY MEASURES FOR MULTI-VALUED ATTRIBUTES FOR DATABASE CLUSTERING
SIMILARITY MEASURES FOR MULTI-VALUED ATTRIBUTES FOR DATABASE CLUSTERING TAE-WAN RYU AND CHRISTOPH F. EICK Department of Computer Science, University of Houston, Houston, Texas 77204-3475 {twryu, ceick}@cs.uh.edu
More informationAn Improved Apriori Algorithm for Association Rules
Research article An Improved Apriori Algorithm for Association Rules Hassan M. Najadat 1, Mohammed Al-Maolegi 2, Bassam Arkok 3 Computer Science, Jordan University of Science and Technology, Irbid, Jordan
More informationCHAPTER 1 INTRODUCTION
CHAPTER 1 INTRODUCTION 1.1 Introduction Pattern recognition is a set of mathematical, statistical and heuristic techniques used in executing `man-like' tasks on computers. Pattern recognition plays an
More informationBagging Is A Small-Data-Set Phenomenon
Bagging Is A Small-Data-Set Phenomenon Nitesh Chawla, Thomas E. Moore, Jr., Kevin W. Bowyer, Lawrence O. Hall, Clayton Springer, and Philip Kegelmeyer Department of Computer Science and Engineering University
More informationUsing Text Learning to help Web browsing
Using Text Learning to help Web browsing Dunja Mladenić J.Stefan Institute, Ljubljana, Slovenia Carnegie Mellon University, Pittsburgh, PA, USA Dunja.Mladenic@{ijs.si, cs.cmu.edu} Abstract Web browsing
More informationMining High Order Decision Rules
Mining High Order Decision Rules Y.Y. Yao Department of Computer Science, University of Regina Regina, Saskatchewan, Canada S4S 0A2 e-mail: yyao@cs.uregina.ca Abstract. We introduce the notion of high
More informationHandling Missing Values via Decomposition of the Conditioned Set
Handling Missing Values via Decomposition of the Conditioned Set Mei-Ling Shyu, Indika Priyantha Kuruppu-Appuhamilage Department of Electrical and Computer Engineering, University of Miami Coral Gables,
More informationGraph Matching: Fast Candidate Elimination Using Machine Learning Techniques
Graph Matching: Fast Candidate Elimination Using Machine Learning Techniques M. Lazarescu 1,2, H. Bunke 1, and S. Venkatesh 2 1 Computer Science Department, University of Bern, Switzerland 2 School of
More informationResearch on Applications of Data Mining in Electronic Commerce. Xiuping YANG 1, a
International Conference on Education Technology, Management and Humanities Science (ETMHS 2015) Research on Applications of Data Mining in Electronic Commerce Xiuping YANG 1, a 1 Computer Science Department,
More informationSummary of Last Chapter. Course Content. Chapter 3 Objectives. Chapter 3: Data Preprocessing. Dr. Osmar R. Zaïane. University of Alberta 4
Principles of Knowledge Discovery in Data Fall 2004 Chapter 3: Data Preprocessing Dr. Osmar R. Zaïane University of Alberta Summary of Last Chapter What is a data warehouse and what is it for? What is
More informationCluster quality 15. Running time 0.7. Distance between estimated and true means Running time [s]
Fast, single-pass K-means algorithms Fredrik Farnstrom Computer Science and Engineering Lund Institute of Technology, Sweden arnstrom@ucsd.edu James Lewis Computer Science and Engineering University of
More informationData Analytics and Boolean Algebras
Data Analytics and Boolean Algebras Hans van Thiel November 28, 2012 c Muitovar 2012 KvK Amsterdam 34350608 Passeerdersstraat 76 1016 XZ Amsterdam The Netherlands T: + 31 20 6247137 E: hthiel@muitovar.com
More informationJoint Entity Resolution
Joint Entity Resolution Steven Euijong Whang, Hector Garcia-Molina Computer Science Department, Stanford University 353 Serra Mall, Stanford, CA 94305, USA {swhang, hector}@cs.stanford.edu No Institute
More informationEnhancing Forecasting Performance of Naïve-Bayes Classifiers with Discretization Techniques
24 Enhancing Forecasting Performance of Naïve-Bayes Classifiers with Discretization Techniques Enhancing Forecasting Performance of Naïve-Bayes Classifiers with Discretization Techniques Ruxandra PETRE
More informationProgress Report: Collaborative Filtering Using Bregman Co-clustering
Progress Report: Collaborative Filtering Using Bregman Co-clustering Wei Tang, Srivatsan Ramanujam, and Andrew Dreher April 4, 2008 1 Introduction Analytics are becoming increasingly important for business
More informationA Review on Cluster Based Approach in Data Mining
A Review on Cluster Based Approach in Data Mining M. Vijaya Maheswari PhD Research Scholar, Department of Computer Science Karpagam University Coimbatore, Tamilnadu,India Dr T. Christopher Assistant professor,
More informationEfficient Case Based Feature Construction
Efficient Case Based Feature Construction Ingo Mierswa and Michael Wurst Artificial Intelligence Unit,Department of Computer Science, University of Dortmund, Germany {mierswa, wurst}@ls8.cs.uni-dortmund.de
More informationA Novel Algorithm for Associative Classification
A Novel Algorithm for Associative Classification Gourab Kundu 1, Sirajum Munir 1, Md. Faizul Bari 1, Md. Monirul Islam 1, and K. Murase 2 1 Department of Computer Science and Engineering Bangladesh University
More information2. Literature Review
Bagging Is A Small-Data-Set Phenomenon Nitesh Chawla l, Thomas E. Moore, Jr., Kevin W. Bowyer2, Lawrence 0. Hall1, Clayton Springe$, and Philip Kegelmeyes ldepartment of Computer Science and Engineering
More informationMultiple Classifier Fusion using k-nearest Localized Templates
Multiple Classifier Fusion using k-nearest Localized Templates Jun-Ki Min and Sung-Bae Cho Department of Computer Science, Yonsei University Biometrics Engineering Research Center 134 Shinchon-dong, Sudaemoon-ku,
More informationData Access Paths for Frequent Itemsets Discovery
Data Access Paths for Frequent Itemsets Discovery Marek Wojciechowski, Maciej Zakrzewicz Poznan University of Technology Institute of Computing Science {marekw, mzakrz}@cs.put.poznan.pl Abstract. A number
More informationIntroduction p. 1 What is the World Wide Web? p. 1 A Brief History of the Web and the Internet p. 2 Web Data Mining p. 4 What is Data Mining? p.
Introduction p. 1 What is the World Wide Web? p. 1 A Brief History of the Web and the Internet p. 2 Web Data Mining p. 4 What is Data Mining? p. 6 What is Web Mining? p. 6 Summary of Chapters p. 8 How
More informationA Comparison of Global and Local Probabilistic Approximations in Mining Data with Many Missing Attribute Values
A Comparison of Global and Local Probabilistic Approximations in Mining Data with Many Missing Attribute Values Patrick G. Clark Department of Electrical Eng. and Computer Sci. University of Kansas Lawrence,
More informationDynamic Load Balancing of Unstructured Computations in Decision Tree Classifiers
Dynamic Load Balancing of Unstructured Computations in Decision Tree Classifiers A. Srivastava E. Han V. Kumar V. Singh Information Technology Lab Dept. of Computer Science Information Technology Lab Hitachi
More informationFuzzy Partitioning with FID3.1
Fuzzy Partitioning with FID3.1 Cezary Z. Janikow Dept. of Mathematics and Computer Science University of Missouri St. Louis St. Louis, Missouri 63121 janikow@umsl.edu Maciej Fajfer Institute of Computing
More informationLecturer 2: Spatial Concepts and Data Models
Lecturer 2: Spatial Concepts and Data Models 2.1 Introduction 2.2 Models of Spatial Information 2.3 Three-Step Database Design 2.4 Extending ER with Spatial Concepts 2.5 Summary Learning Objectives Learning
More informationOrdering attributes for missing values prediction and data classification
Ordering attributes for missing values prediction and data classification E. R. Hruschka Jr., N. F. F. Ebecken COPPE /Federal University of Rio de Janeiro, Brazil. Abstract This work shows the application
More informationImproved Frequent Pattern Mining Algorithm with Indexing
IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661,p-ISSN: 2278-8727, Volume 16, Issue 6, Ver. VII (Nov Dec. 2014), PP 73-78 Improved Frequent Pattern Mining Algorithm with Indexing Prof.
More informationPerceptron-Based Oblique Tree (P-BOT)
Perceptron-Based Oblique Tree (P-BOT) Ben Axelrod Stephen Campos John Envarli G.I.T. G.I.T. G.I.T. baxelrod@cc.gatech sjcampos@cc.gatech envarli@cc.gatech Abstract Decision trees are simple and fast data
More informationA Parallel Evolutionary Algorithm for Discovery of Decision Rules
A Parallel Evolutionary Algorithm for Discovery of Decision Rules Wojciech Kwedlo Faculty of Computer Science Technical University of Bia lystok Wiejska 45a, 15-351 Bia lystok, Poland wkwedlo@ii.pb.bialystok.pl
More informationCLASSIFICATION FOR SCALING METHODS IN DATA MINING
CLASSIFICATION FOR SCALING METHODS IN DATA MINING Eric Kyper, College of Business Administration, University of Rhode Island, Kingston, RI 02881 (401) 874-7563, ekyper@mail.uri.edu Lutz Hamel, Department
More informationAn Evolutionary Algorithm for Mining Association Rules Using Boolean Approach
An Evolutionary Algorithm for Mining Association Rules Using Boolean Approach ABSTRACT G.Ravi Kumar 1 Dr.G.A. Ramachandra 2 G.Sunitha 3 1. Research Scholar, Department of Computer Science &Technology,
More informationExtended R-Tree Indexing Structure for Ensemble Stream Data Classification
Extended R-Tree Indexing Structure for Ensemble Stream Data Classification P. Sravanthi M.Tech Student, Department of CSE KMM Institute of Technology and Sciences Tirupati, India J. S. Ananda Kumar Assistant
More informationA Two Stage Zone Regression Method for Global Characterization of a Project Database
A Two Stage Zone Regression Method for Global Characterization 1 Chapter I A Two Stage Zone Regression Method for Global Characterization of a Project Database J. J. Dolado, University of the Basque Country,
More informationImproving Classifier Performance by Imputing Missing Values using Discretization Method
Improving Classifier Performance by Imputing Missing Values using Discretization Method E. CHANDRA BLESSIE Assistant Professor, Department of Computer Science, D.J.Academy for Managerial Excellence, Coimbatore,
More informationGenetic Programming for Data Classification: Partitioning the Search Space
Genetic Programming for Data Classification: Partitioning the Search Space Jeroen Eggermont jeggermo@liacs.nl Joost N. Kok joost@liacs.nl Walter A. Kosters kosters@liacs.nl ABSTRACT When Genetic Programming
More informationOPTIMIZATION OF BAGGING CLASSIFIERS BASED ON SBCB ALGORITHM
OPTIMIZATION OF BAGGING CLASSIFIERS BASED ON SBCB ALGORITHM XIAO-DONG ZENG, SAM CHAO, FAI WONG Faculty of Science and Technology, University of Macau, Macau, China E-MAIL: ma96506@umac.mo, lidiasc@umac.mo,
More informationFeature Construction and δ-free Sets in 0/1 Samples
Feature Construction and δ-free Sets in 0/1 Samples Nazha Selmaoui 1, Claire Leschi 2, Dominique Gay 1, and Jean-François Boulicaut 2 1 ERIM, University of New Caledonia {selmaoui, gay}@univ-nc.nc 2 INSA
More informationAN IMPROVISED FREQUENT PATTERN TREE BASED ASSOCIATION RULE MINING TECHNIQUE WITH MINING FREQUENT ITEM SETS ALGORITHM AND A MODIFIED HEADER TABLE
AN IMPROVISED FREQUENT PATTERN TREE BASED ASSOCIATION RULE MINING TECHNIQUE WITH MINING FREQUENT ITEM SETS ALGORITHM AND A MODIFIED HEADER TABLE Vandit Agarwal 1, Mandhani Kushal 2 and Preetham Kumar 3
More informationThis tutorial has been prepared for computer science graduates to help them understand the basic-to-advanced concepts related to data mining.
About the Tutorial Data Mining is defined as the procedure of extracting information from huge sets of data. In other words, we can say that data mining is mining knowledge from data. The tutorial starts
More informationEnsemble methods in machine learning. Example. Neural networks. Neural networks
Ensemble methods in machine learning Bootstrap aggregating (bagging) train an ensemble of models based on randomly resampled versions of the training set, then take a majority vote Example What if you
More informationThe digital copy of this thesis is protected by the Copyright Act 1994 (New Zealand).
http://waikato.researchgateway.ac.nz/ Research Commons at the University of Waikato Copyright Statement: The digital copy of this thesis is protected by the Copyright Act 1994 (New Zealand). The thesis
More informationFeature Selection with Adjustable Criteria
Feature Selection with Adjustable Criteria J.T. Yao M. Zhang Department of Computer Science, University of Regina Regina, Saskatchewan, Canada S4S 0A2 E-mail: jtyao@cs.uregina.ca Abstract. We present a
More informationA Multiclassifier based Approach for Word Sense Disambiguation using Singular Value Decomposition
A Multiclassifier based Approach for Word Sense Disambiguation using Singular Value Decomposition Ana Zelaia, Olatz Arregi and Basilio Sierra Computer Science Faculty University of the Basque Country ana.zelaia@ehu.es
More informationCSE 6242/CX Ensemble Methods. Or, Model Combination. Based on lecture by Parikshit Ram
CSE 6242/CX 4242 Ensemble Methods Or, Model Combination Based on lecture by Parikshit Ram Numerous Possible Classifiers! Classifier Training time Cross validation Testing time Accuracy knn classifier None
More informationDecision Trees Dr. G. Bharadwaja Kumar VIT Chennai
Decision Trees Decision Tree Decision Trees (DTs) are a nonparametric supervised learning method used for classification and regression. The goal is to create a model that predicts the value of a target
More informationData Mining. Introduction. Hamid Beigy. Sharif University of Technology. Fall 1395
Data Mining Introduction Hamid Beigy Sharif University of Technology Fall 1395 Hamid Beigy (Sharif University of Technology) Data Mining Fall 1395 1 / 21 Table of contents 1 Introduction 2 Data mining
More informationTable Of Contents: xix Foreword to Second Edition
Data Mining : Concepts and Techniques Table Of Contents: Foreword xix Foreword to Second Edition xxi Preface xxiii Acknowledgments xxxi About the Authors xxxv Chapter 1 Introduction 1 (38) 1.1 Why Data
More informationGraph Mining and Social Network Analysis
Graph Mining and Social Network Analysis Data Mining and Text Mining (UIC 583 @ Politecnico di Milano) References q Jiawei Han and Micheline Kamber, "Data Mining: Concepts and Techniques", The Morgan Kaufmann
More informationBipartite Graph Partitioning and Content-based Image Clustering
Bipartite Graph Partitioning and Content-based Image Clustering Guoping Qiu School of Computer Science The University of Nottingham qiu @ cs.nott.ac.uk Abstract This paper presents a method to model the
More informationA Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection (Kohavi, 1995)
A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection (Kohavi, 1995) Department of Information, Operations and Management Sciences Stern School of Business, NYU padamopo@stern.nyu.edu
More informationComparative Study of Subspace Clustering Algorithms
Comparative Study of Subspace Clustering Algorithms S.Chitra Nayagam, Asst Prof., Dept of Computer Applications, Don Bosco College, Panjim, Goa. Abstract-A cluster is a collection of data objects that
More informationData Mining, Parallelism, Data Mining, Parallelism, and Grids. Queen s University, Kingston David Skillicorn
Data Mining, Parallelism, Data Mining, Parallelism, and Grids David Skillicorn Queen s University, Kingston skill@cs.queensu.ca Data mining builds models from data in the hope that these models reveal
More informationUsing Machine Learning to Optimize Storage Systems
Using Machine Learning to Optimize Storage Systems Dr. Kiran Gunnam 1 Outline 1. Overview 2. Building Flash Models using Logistic Regression. 3. Storage Object classification 4. Storage Allocation recommendation
More informationPerformance Analysis of Data Mining Classification Techniques
Performance Analysis of Data Mining Classification Techniques Tejas Mehta 1, Dr. Dhaval Kathiriya 2 Ph.D. Student, School of Computer Science, Dr. Babasaheb Ambedkar Open University, Gujarat, India 1 Principal
More informationQuestion Bank. 4) It is the source of information later delivered to data marts.
Question Bank Year: 2016-2017 Subject Dept: CS Semester: First Subject Name: Data Mining. Q1) What is data warehouse? ANS. A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile
More informationA Multiclassifier based Approach for Word Sense Disambiguation using Singular Value Decomposition
A Multiclassifier based Approach for Word Sense Disambiguation using Singular Value Decomposition Ana Zelaia, Olatz Arregi and Basilio Sierra Computer Science Faculty University of the Basque Country ana.zelaia@ehu.es
More information