OPTIMIZING MIXTURES OF LOCAL EXPERTS IN TREE-LIKE REGRESSION MODELS
Proc. IASTED Conference on Artificial Intelligence and Applications, M.H. Hamza (ed.), Innsbruck, Austria, February 2005

Michael Baskara L. A. SIEK, Prima Intelligence, Inc., P.O. Box 315, Surabaya, Indonesia
Dimitri P. SOLOMATINE, UNESCO-IHE Institute for Water Education, P.O. Box 3015, Delft, The Netherlands (corresponding author)

ABSTRACT
A mixture of local experts consists of a set of specialized models, each of which is responsible for a particular local region of the input space. Many algorithms in this class, for example the M5 model tree, are sub-optimal (greedy). An algorithm for building optimal local mixtures of regression experts is proposed and compared to an MLP ANN on a number of cases.

KEY WORDS
Machine learning, mixtures, local models, regression.

1. Mixtures of Local Experts

A complex machine learning problem can be solved by dividing it into a number of simple tasks and combining their solutions. The input space can be divided into a number of regions (subsets of data), for each of which a separate specialized model (expert, module) is built. The outputs of the experts are then combined. Such models are named committee machines, mixtures of experts, modular models, stacked models, etc. ([12], [14]). Two criteria can be used to classify such models: how the experts are combined, and on which data they are trained. The way experts are combined falls into one of two major categories: (1) static, where the responses of the experts are combined by a mechanism that does not involve the input signal, e.g., using fixed weights; examples are ensemble averaging (where separate experts are built for the whole input space and then averaged) and boosting; (2) dynamic, where experts are combined using weighting schemes that depend on the input vector; an example is the statistically-driven approach of Jacobs, Jordan, Nowlan and Hinton [14] (called mixture of experts).
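These two combination schemes can be sketched as follows. This is an illustrative sketch, not code from the paper; the expert and gate functions are made-up stand-ins:

```python
# Static vs. dynamic combination of experts (illustrative sketch).
def combine_static(experts, weights, x):
    """Static: fixed weights that do not depend on the input x."""
    return sum(w * e(x) for w, e in zip(weights, experts))

def combine_dynamic(experts, gate, x):
    """Dynamic (mixture of experts): a gate assigns input-dependent weights."""
    return sum(g * e(x) for g, e in zip(gate(x), experts))

experts = [lambda x: x, lambda x: 2 * x]          # two toy experts
print(combine_static(experts, [0.5, 0.5], 4))     # 0.5*4 + 0.5*8 = 6.0
print(combine_dynamic(experts, lambda x: (1, 0) if x < 0 else (0, 1), 4))  # 8
```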
The regions (or subsets of data) for which experts are responsible can be constructed in two ways: (1) in a probabilistic fashion, so that they may contain repeated examples and may intersect, as is done, e.g., in boosting [6], [28]; (2) as a result of hard splitting of the input space. In the latter case each expert is trained individually on the subset of instances contained in its local region, and finally the output of only one specialized expert is taken into consideration. Indeed, if the regions are non-intersecting then there is no reason to combine the outputs of different experts or modules, and only one of them is explicitly used (a particular case in which the weights of the other experts are zero). In tree-like models such regions are constructed by progressively narrowing regions of the input space. The result is a hierarchy: a tree (often a binary one) with splitting rules in the non-terminal nodes and the expert models in the leaves (Fig. 1). Such models will be called in this paper mixtures of local experts (MLEs), and the experts will be referred to as modules, or specialized models. The models in an MLE can be of any type. If the model output is a nominal variable, so that a classification problem is to be solved, then one of the popular methods is the decision tree. For solving numerical prediction (regression) problems, there are a number of methods based on the idea of a decision tree: (a) the regression tree of Breiman et al. [4], where a leaf is associated with the average output value of the instances sorted down to it (a zero-order model), and (b) the model tree, where leaves carry regression functions of the input variables. Among model trees, two approaches can be distinguished: that of Friedman [10] in the MARS (multivariate adaptive regression splines) algorithm, implemented in the MARS software, and the M5 model trees of Quinlan [20], implemented in the Cubist software and, with some changes, in the Weka software ([8]).
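The hard-split dispatch of Fig. 1 can be sketched as follows; this is an illustrative sketch rather than the paper's implementation, and the node structure, split attributes and thresholds are assumed for the example:

```python
# Routing an input vector down a tree of split rules to a single local expert.
class Node:
    def __init__(self, attr=None, threshold=None, left=None, right=None, expert=None):
        self.attr = attr            # index of the split attribute (internal nodes)
        self.threshold = threshold  # split value
        self.left = left            # subtree for x[attr] <= threshold
        self.right = right          # subtree for x[attr] > threshold
        self.expert = expert        # regression model (leaves only)

def predict(node, x):
    """Follow the split rules down to a leaf; only that expert's output is used."""
    while node.expert is None:
        node = node.left if x[node.attr] <= node.threshold else node.right
    return node.expert(x)

# Two linear local experts separated by a split on attribute 0 at 0.5.
tree = Node(attr=0, threshold=0.5,
            left=Node(expert=lambda x: 2.0 * x[0]),
            right=Node(expert=lambda x: 1.0 + x[0]))
print(predict(tree, [0.2]))  # left expert is used
print(predict(tree, [0.8]))  # right expert is used
```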
The mentioned algorithms are suboptimal (greedy), since the choice of the attribute for a split node is made once and is not reconsidered. The subject of this paper is the optimisation of building MLEs consisting of simple (linear) regression models. The M5 algorithm is chosen as the basic greedy algorithm for building MLEs of linear models; the aim is to propose an approach that allows optimal MLEs to be built. The M5 algorithm is analogous to the (also greedy) ID3 decision tree algorithm of Quinlan in the sense that it minimizes the intra-subset variation in the output values down each branch. In each node, the standard deviation of the output values of the examples reaching the node is taken as a measure of the error of this node, and the expected reduction in this error is calculated for each attribute and each possible split value. Such
split attribute and split value that maximize the expected error reduction are chosen for each node. Splitting terminates when the output values of all the instances that reach a node vary only slightly, or when only a few instances remain. After the initial tree has been grown, the linear regression models are generated and, possibly, simplified, pruned and smoothed. Wang & Witten [31] reported the M5' algorithm, based on the original M5 algorithm but able to deal with enumerated attributes, to treat missing values, and using a different splitting termination condition.

Fig. 1: Consecutive application of rules in a tree-like structure leading to a local expert.

2. Optimization of MLEs

In the context of classification problems, a number of researchers have aimed at improving the predictive accuracy of tree-based models; they dealt mostly with decision trees and with greedy approaches: [5], [7], [9], [15], [19] and [23]. A rare exception using a non-greedy approach for constructing decision trees was reported by Bennett [1]: the tree is represented as a system of linear inequalities, and the system is solved using the iterative linear programming Frank-Wolfe method. The experiments indicated the advantages of the proposed method, but also that it may be trapped in a local minimum. To avoid this, Bennett & Blue [2] used an Extreme Point Tabu Search (EPTS); it performed better than C4.5 on all 10 datasets tested. The problem of optimizing the construction of local regression models such as the M5 tree has, however, received very little attention.

2.1. Optimization of MLEs

The problem of building an MLE can be posed in a general way, ensuring that the error of the resulting overall model is minimal among all possible configurations:

Find M_opt such that E(M_opt) = min over all M ∈ {M_k}    (1)

where M_opt is a model with the optimal configuration, {M_k} is the set of all possible model configurations, and E is the model error. For the purpose of this paper it is assumed that M is an MLE consisting of a number of individual models M_i. To be more specific, we limit the type of MLE by assuming that it is built via a tree-like approach like M5.

2.2. Step-wise model construction

The idea of optimisation is based on a simple empirical observation: an overall hard optimization problem can be avoided by splitting the generation of an MLE into two steps:
1. Global optimisation: generate the upper layers of the tree (from the 1st layer) by a global (multi-extremum) optimization algorithm (better than greedy);
2. Greedy search: generate the rest of the tree (the lower layers) by a faster greedy algorithm such as M5 [20], [32].

The layer up to which global optimization is applied could differ between branches, as illustrated in Fig. 2. However, it would be reasonable to fix it at some value for all branches; in this case it will be denoted L. This allows for a flexible trade-off between speed and optimality.

Fig. 2: Optimizing construction of a tree-like MLE (the upper subtree is built by global optimization, the lower subtrees by a greedy algorithm).

2.3. M5opt algorithm for building MLEs

The algorithmic approach presented above allows the use of any type of local regression model in the leaves of the generated tree. An implementation of this approach is the M5opt algorithm, oriented at building MLEs with linear regression models in the leaves (i.e., M5 model trees) and using exhaustive search; it is presented below. By a tree structure we understand an encoding of a tree which does not yet have the associated split attributes and
values. By alg_parameters we understand all the parameters specific to a particular implementation of the M5 algorithm.

Input: instances, alg_parameters, number_of_attributes
Output: Most_accurate_model_tree
1. Generate all tree structures {T_i} up to the user-defined tree layer
2. For each valid tree structure T_i do steps 3-8
3. NOnes = the number of 1s in the current tree T_i
4. Generate all possible attribute combinations corresponding to the tree nodes: A_j(NOnes, number_of_attributes)
5. For each attribute combination A_j do steps 6-8
6. Build the current model tree based on A_j and alg_parameters
7. If the current model tree is more accurate than Most_accurate_model_tree
8. then replace Most_accurate_model_tree with the current model tree
9. Stop

Optimization of the upper subtree

1) Exhaustive search: The problem of overall tree optimization can be computationally costly, since each attribute has to be tested across a number of possible split values. The full problem of optimal binary decision tree construction (which in essence applies to M5 as well) is reported to be NP-complete [13].

2) M5opt with randomized search: In order to find an estimate of the global optimum when the objective function is not known analytically, a number of methods can be applied. Random search techniques are most widely used for this purpose: genetic and evolutionary algorithms, controlled random search, adaptive cluster covering, tabu search, simulated annealing, and others ([16], [24], [29]). Evolutionary and genetic algorithms (GA) are among the most popular techniques of global optimization, especially for discrete problems. A chromosome (or string) can be encoded as an integer-valued vector representing a collection of attributes with values in the range [0, n], where n is the number of attributes. The position of a particular element in the chromosome indicates the node position in the tree.
The element value is the number of the attribute selected for this node; a zero element means there is no node in the corresponding position of the tree. Chromosomes not corresponding to feasible trees are discarded. Software like GLOBE [11], which allows a number of global optimisation algorithms to be used, can be employed. The formulation of M5opt with a GA as the optimizer for building M5 trees is similar to the version of the algorithm with exhaustive search presented above.

2.4. Additional features of M5opt

M5 builds the initial model tree in a way similar to regression trees [4], where each node is characterized by its split attribute and value, and by the averaged output value of the instances that reach the node. The latter is used for measuring the error of the initial model tree. The M5opt algorithm, however, is able to build a linear model for the instances that reach a node directly in the initial model tree. This allows a more accurate model to be obtained already at the initial stage. A better version of pruning (called compacting) was proposed as well.

3. Experiments

For the experiments we employed five benchmark data sets (Autompg, Bodyfat, CPU, Friedman and Housing) from Blake and Mertz [3], three hydrological data sets of the Sieve catchment (Italy), and three hydrological data sets of the Bagmati catchment (Nepal). The problem associated with the hydrological data sets is to predict runoffs Q_t+i several hours ahead (i = 1, 3 or 6) on the basis of previous runoffs (Q_t-τ) and effective rainfalls (RE_t-τ), τ being between 0 and 2. Before building a prediction model, it was necessary to analyze the physical characteristics of the catchment and then to select the input and output variables by analyzing the inter-dependencies between variables and the lags τ using correlation and average mutual information analysis.
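The construction of lagged inputs described above can be sketched as follows. This is illustrative code, not the authors' preprocessing; the lag sets, lead time and series values are made up:

```python
# Build a lagged input matrix for a model of the form
# Q_{t+lead} = f(RE_t, RE_{t-1}, ..., Q_t, Q_{t-1}, ...).
def make_lagged(RE, Q, re_lags=(0, 1, 2), q_lags=(0, 1), lead=1):
    """Pair lagged rainfall/runoff values with the runoff 'lead' steps ahead."""
    start = max(max(re_lags), max(q_lags))   # first t with all lags available
    X, y = [], []
    for t in range(start, len(Q) - lead):
        X.append([RE[t - l] for l in re_lags] + [Q[t - l] for l in q_lags])
        y.append(Q[t + lead])
    return X, y

RE = [1, 2, 3, 4, 5, 6]          # toy effective-rainfall series
Q = [10, 20, 30, 40, 50, 60]     # toy runoff series
X, y = make_lagged(RE, Q)
print(X[0], y[0])  # [3, 2, 1, 30, 20] 40
```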
The final forms of the models for the Sieve catchment are as follows:

Q_t+1 = f(RE_t, RE_t-1, RE_t-2, RE_t-3, RE_t-4, RE_t-5, Q_t, Q_t-1, Q_t-2)
Q_t+3 = f(RE_t, RE_t-1, RE_t-2, RE_t-3, Q_t, Q_t-1)
Q_t+6 = f(RE_t, Q_t)

The model for the Bagmati case was set to be

Q_t+1 = f(RE_t, RE_t-1, RE_t-2, Q_t, Q_t-1)

In the Bagmati case study the data set was separated into high flows (>300 m3/s) and low flows, and two separate models were built. Three methods were employed: M5', M5opt and ANN (MLP). 1) M5' models were built with the default parameter values of the Weka software [32]: pruning factor 2.0 and smoothing enabled. The same parameter settings were also used in the M5opt experiments. 2) M5opt model trees allow a large number of parameter combinations to be set. In the experiments twelve combinations were investigated; the best combinations reported here used exhaustive search for subtrees of up to L = 3 levels. More details on the parameter settings can be found in [22]. 3) ANNs were built using the NeuroSolutions [17] and NeuralMachine [18] software. The best network appeared to be a three-layered perceptron (MLP) with 18 hidden nodes and hyperbolic tangent activation functions. The stopping criterion was either the mean squared error in training reaching a threshold or the number of epochs reaching a maximum.

4. Results and Discussion

Algorithms' performance was measured by root mean squared error (RMSE), and the overall experimental results
are summarized in Table 1. M5opt model trees were the most accurate on seven data sets, and the ANN on the other four. A so-called scoring matrix SM was used to present the results in a comparative way. This is a square matrix with element SM_i,j representing the average relative performance of algorithm i compared to algorithm j over all data sets used (diagonal elements are zero):

SM_i,j = (1/N) Σ_{k=1..N} (RMSE_k,j − RMSE_k,i) / max(RMSE_k,j, RMSE_k,i) for i ≠ j;  SM_i,j = 0 for i = j    (2)

where N is the number of data sets. By summing up all the element values column-wise one can determine the overall score of each algorithm, the best algorithm having the highest positive score. M5opt has the highest score. The experiments with M5opt indicated that the use of exhaustive search in building model trees could indeed give higher accuracy. Apart from the non-greedy optimization, an additional feature implemented in M5opt was the improved pruning (compacting) scheme. The advantages of using it are: (1) the resulting model tree can be simpler (as simple as the user wants), and (2) the model tree itself is more balanced, which is desirable for practical applications. To see the effect of optimization and compacting, compare the model trees built for one of the case studies (Sieve Q_t+6): the M5' tree has 7 rules (Fig. 3a), while the M5opt tree has only 2 rules and a lower RMSE (Fig. 3b).

TABLE 1
RMSE OF ANN, M5' AND M5OPT (TRAINING AND VERIFICATION) ON ALL DATA SETS: Sieve catchment (Q_t+1, Q_t+3, Q_t+6), Bagmati catchment (All, High, Low), and the benchmark data sets (Auto-mpg, Body-fat, CPU, Friedman, Housing).

TABLE 2
SCORING MATRIX FOR ALL ALGORITHMS ON ALL 11 VERIFICATION DATA SETS (rows and columns: ANN, M5', M5opt; bottom row: total score).

The superior performance of M5opt, even though tested on a limited number of examples, suggests that the approach used in the proposed framework may improve the accuracy of other types of tree-like MLEs, such as CART and C4.5.
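Eq. (2) and the per-algorithm scores can be sketched as follows. The RMSE values here are made up, and each algorithm's score is taken as the sum of its row of SM, which under this sign convention gives the best (lowest-RMSE) algorithm the highest positive score:

```python
# Scoring matrix of Eq. (2): SM[i][j] averages the relative performance of
# algorithm i compared to algorithm j over all data sets.
def scoring_matrix(rmse):
    """rmse[k][i] is the RMSE of algorithm i on data set k."""
    N, A = len(rmse), len(rmse[0])
    SM = [[0.0] * A for _ in range(A)]
    for i in range(A):
        for j in range(A):
            if i != j:
                SM[i][j] = sum((rmse[k][j] - rmse[k][i]) / max(rmse[k][j], rmse[k][i])
                               for k in range(N)) / N
    scores = [sum(row) for row in SM]  # overall score of each algorithm
    return SM, scores

rmse = [[2.0, 1.0], [4.0, 2.0]]  # hypothetical: algorithm 1 is better on both sets
SM, scores = scoring_matrix(rmse)
print(scores)  # [-0.5, 0.5] -> algorithm 1 scores highest
```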
More detailed analysis of the statistical properties of the proposed framework and of the error margins of its performance is yet to be done.

5. Conclusion

The proposed empirical algorithmic framework, combining greedy and non-greedy approaches to building local mixtures of experts, allows for a flexible trade-off between speed and optimality. Its particular implementation, the M5opt algorithm, makes it possible to construct modular linear regression models (M5 model trees) that are more accurate than those built by the traditional greedy approach of M5 and M5'. The performance of M5opt in relation to an ANN was investigated as well. The results indicate that M5opt outperforms M5' on all cases and outperforms the ANN on 8 out of 11 cases. An important advantage of regression and model trees in comparison with ANNs is their transparent structure, providing a domain expert with a simple and reproducible data-driven model ([26]). The additional computational cost associated with a higher level of optimization (the problem is NP-complete if a fully exhaustive search is employed [13]) can be controlled by the user by selecting the tree layer down to which the non-greedy search is executed and the type of search employed (exhaustive or randomised). Research is planned to apply the proposed approach to decision and regression trees, and to include non-linear regression models such as MLPs and RBF networks as local experts.
Fig. 3: Sieve Q_t+6 case study: local expert models generated by (a) the M5' algorithm and (b) the M5opt algorithm. Numbers in parentheses after the linear models (LM) are the number of examples sorted to that model and, after the slash, the RMSE divided by the average absolute deviation, expressed in %. It can be seen that M5opt generates smaller and more accurate models.

References:
[1] Bennett, K.P., Global tree optimization: a non-greedy decision tree algorithm, Journal of Computing Science and Statistics, 26, 1994.
[2] Bennett, K.P., & Blue, J.A., Optimal decision trees, R.P.I. Math Report No. 214.
[3] Blake, C.L., & Mertz, C.J., UCI Repository of machine learning databases, ~mlearn/mlrepository.html, University of California, Irvine, Department of Information and Computer Science.
[4] Breiman, L., Friedman, J.H., Olshen, R.A., & Stone, C.J., Classification and regression trees (Wadsworth International, Belmont, CA, 1984).
[5] Caruana, R., & Freitag, D., Greedy attribute selection, Proc. International Conference on Machine Learning, 1994.
[6] Drucker, H., Improving regressors using boosting, Proc. of the 14th Int. Conf. on Machine Learning, D.H. Fisher, Jr. (ed.), Morgan Kaufmann, 1997.
[7] Frank, E., & Witten, I.H., Selecting multiway splits in decision trees, Working Paper 96/31, Dept. of Computer Science, University of Waikato, December 1996.
[8] Frank, E., Wang, Y., Inglis, S., Holmes, G., & Witten, I.H., Using model trees for classification, Journal of Machine Learning, 32(1), 1998.
[9] Freund, Y., & Mason, L., The alternating decision tree learning algorithm, Proc. 16th International Conf. on Machine Learning, Morgan Kaufmann, San Francisco, CA, 1999.
[10] Friedman, J.H., Multivariate adaptive regression splines, Annals of Statistics, 19, 1991.
[11] GLOBE: global and evolutionary optimisation tool.
[12] Haykin, S., Neural networks, second edition, Prentice-Hall.
[13] Hyafil, L., & Rivest, R.L.,
Constructing optimal binary decision trees is NP-complete, Information Processing Letters, 5, 1976.
[14] Jacobs, R.A., Jordan, M.I., Nowlan, S.J., & Hinton, G.E., Adaptive mixtures of local experts, Neural Computation, 3, 1991.
[15] Kamber, M., Winstone, L., Gong, W., Cheng, S., & Han, J., Generalization and decision tree induction: efficient classification in data mining, Proc. of the International Workshop on Research Issues on Data Engineering (RIDE 97), Birmingham, England, April 1997.
[16] Michalewicz, Z., Genetic algorithms + data structures = evolution programs, third edition (Springer-Verlag, Heidelberg, Germany, 1999).
[17] NeuroSolutions software.
[18] NeuralMachine software: a neural network tool.
[19] Pfahringer, B., Holmes, G., & Kirkby, R., Optimizing the induction of alternating decision trees, Proc. of the Fifth Pacific-Asia Conf. on Advances in Knowledge Discovery and Data Mining, 2001.
[20] Quinlan, J.R., Learning with continuous classes, Proc. AI'92, 5th Australian Joint Conference on Artificial Intelligence, Adams & Sterling (eds.), World Scientific, Singapore, 1992.
[21] Quinlan, J.R., C4.5: Programs for Machine Learning, Morgan Kaufmann, San Mateo, CA, 1993.
[22] Siek, M.B.L.A., Flexibility and optimality in model tree learning with application to water-related problems, MSc thesis, IHE Delft, The Netherlands.
[23] Sikonja, M.R., & Kononenko, I., Pruning regression trees with MDL, ECAI 98, 13th European Conference on Artificial Intelligence, 1998.
[24] Solomatine, D.P., Two strategies of adaptive cluster covering with descent and their comparison to other algorithms, Journal of Global Optimization, 14(1), 1999.
[25] Solomatine, D.P., Applications of data-driven modelling and machine learning in control of water resources, Computational Intelligence in Control, M. Mohammadian, R.A. Sarker and X. Yao (eds.), Idea Group Publishing, 2002.
[26] Solomatine, D.P., & Dulal, K.N., Model tree as an alternative to neural network in rainfall-runoff modelling, Hydrological Sciences Journal, 48(3), 2003.
[27] Solomatine, D.P., Mixtures of simple models vs ANNs in hydrological modelling, Proc. 3rd International Conference on Hybrid Intelligent Systems (HIS'03), Melbourne, December 2003.
[28] Solomatine, D.P., & Shrestha, D.L., AdaBoost.RT: a boosting algorithm for regression problems, Proc. International Joint Conference on Neural Networks (IJCNN-2004), Budapest, Hungary, July 2004.
[29] Törn, A., & Zilinskas, A., Global optimization, Springer-Verlag, Berlin, 1989, 255 pp.
[30] Utgoff, P.E., Berkman, N.C., & Clouse, J.A., Decision tree induction based on efficient tree restructuring, Journal of Machine Learning, 29(1), 1997.
[31] Wang, Y., & Witten, I.H., Induction of model trees for predicting continuous classes, Proc. of the European Conference on Machine Learning, Prague, Czech Republic, 1997.
[32] Witten, I.H., & Frank, E., Data Mining, Morgan Kaufmann Publishers.
Production Systems and Information Engineering Volume 4 (2006), pp. 115-124 115 USING REGRESSION TREES IN PREDICTIVE MODELLING TAMÁS FEHÉR University of Miskolc, Hungary Department of Information Engineering
More informationForward Feature Selection Using Residual Mutual Information
Forward Feature Selection Using Residual Mutual Information Erik Schaffernicht, Christoph Möller, Klaus Debes and Horst-Michael Gross Ilmenau University of Technology - Neuroinformatics and Cognitive Robotics
More informationSandeep Kharidhi and WenSui Liu ChoicePoint Precision Marketing
Generalized Additive Model and Applications in Direct Marketing Sandeep Kharidhi and WenSui Liu ChoicePoint Precision Marketing Abstract Logistic regression 1 has been widely used in direct marketing applications
More informationA Two-level Learning Method for Generalized Multi-instance Problems
A wo-level Learning Method for Generalized Multi-instance Problems Nils Weidmann 1,2, Eibe Frank 2, and Bernhard Pfahringer 2 1 Department of Computer Science University of Freiburg Freiburg, Germany weidmann@informatik.uni-freiburg.de
More informationAn Effective Performance of Feature Selection with Classification of Data Mining Using SVM Algorithm
Proceedings of the National Conference on Recent Trends in Mathematical Computing NCRTMC 13 427 An Effective Performance of Feature Selection with Classification of Data Mining Using SVM Algorithm A.Veeraswamy
More informationFlexible-Hybrid Sequential Floating Search in Statistical Feature Selection
Flexible-Hybrid Sequential Floating Search in Statistical Feature Selection Petr Somol 1,2, Jana Novovičová 1,2, and Pavel Pudil 2,1 1 Dept. of Pattern Recognition, Institute of Information Theory and
More informationCse634 DATA MINING TEST REVIEW. Professor Anita Wasilewska Computer Science Department Stony Brook University
Cse634 DATA MINING TEST REVIEW Professor Anita Wasilewska Computer Science Department Stony Brook University Preprocessing stage Preprocessing: includes all the operations that have to be performed before
More informationIMPLEMENTATION OF CLASSIFICATION ALGORITHMS USING WEKA NAÏVE BAYES CLASSIFIER
IMPLEMENTATION OF CLASSIFICATION ALGORITHMS USING WEKA NAÏVE BAYES CLASSIFIER N. Suresh Kumar, Dr. M. Thangamani 1 Assistant Professor, Sri Ramakrishna Engineering College, Coimbatore, India 2 Assistant
More informationChapter 12 Feature Selection
Chapter 12 Feature Selection Xiaogang Su Department of Statistics University of Central Florida - 1 - Outline Why Feature Selection? Categorization of Feature Selection Methods Filter Methods Wrapper Methods
More informationTechnical Note Using Model Trees for Classification
c Machine Learning,, 1 14 () Kluwer Academic Publishers, Boston. Manufactured in The Netherlands. Technical Note Using Model Trees for Classification EIBE FRANK eibe@cs.waikato.ac.nz YONG WANG yongwang@cs.waikato.ac.nz
More informationDATA ANALYSIS I. Types of Attributes Sparse, Incomplete, Inaccurate Data
DATA ANALYSIS I Types of Attributes Sparse, Incomplete, Inaccurate Data Sources Bramer, M. (2013). Principles of data mining. Springer. [12-21] Witten, I. H., Frank, E. (2011). Data Mining: Practical machine
More informationRank Measures for Ordering
Rank Measures for Ordering Jin Huang and Charles X. Ling Department of Computer Science The University of Western Ontario London, Ontario, Canada N6A 5B7 email: fjhuang33, clingg@csd.uwo.ca Abstract. Many
More informationCS4445 Data Mining and Knowledge Discovery in Databases. A Term 2008 Exam 2 October 14, 2008
CS4445 Data Mining and Knowledge Discovery in Databases. A Term 2008 Exam 2 October 14, 2008 Prof. Carolina Ruiz Department of Computer Science Worcester Polytechnic Institute NAME: Prof. Ruiz Problem
More informationInducing Cost-Sensitive Trees via Instance Weighting
Inducing Cost-Sensitive Trees via Instance Weighting Kai Ming Ting School of Computing and Mathematics, Deakin University, Vic 3168, Australia. Abstract. We introduce an instance-weighting method to induce
More informationUnivariate Margin Tree
Univariate Margin Tree Olcay Taner Yıldız Department of Computer Engineering, Işık University, TR-34980, Şile, Istanbul, Turkey, olcaytaner@isikun.edu.tr Abstract. In many pattern recognition applications,
More informationData Mining Practical Machine Learning Tools and Techniques. Slides for Chapter 6 of Data Mining by I. H. Witten and E. Frank
Data Mining Practical Machine Learning Tools and Techniques Slides for Chapter 6 of Data Mining by I. H. Witten and E. Frank Implementation: Real machine learning schemes Decision trees Classification
More informationPattern Recognition. Kjell Elenius. Speech, Music and Hearing KTH. March 29, 2007 Speech recognition
Pattern Recognition Kjell Elenius Speech, Music and Hearing KTH March 29, 2007 Speech recognition 2007 1 Ch 4. Pattern Recognition 1(3) Bayes Decision Theory Minimum-Error-Rate Decision Rules Discriminant
More informationUnsupervised Discretization using Tree-based Density Estimation
Unsupervised Discretization using Tree-based Density Estimation Gabi Schmidberger and Eibe Frank Department of Computer Science University of Waikato Hamilton, New Zealand {gabi, eibe}@cs.waikato.ac.nz
More informationAn Improved Apriori Algorithm for Association Rules
Research article An Improved Apriori Algorithm for Association Rules Hassan M. Najadat 1, Mohammed Al-Maolegi 2, Bassam Arkok 3 Computer Science, Jordan University of Science and Technology, Irbid, Jordan
More information7. Decision or classification trees
7. Decision or classification trees Next we are going to consider a rather different approach from those presented so far to machine learning that use one of the most common and important data structure,
More informationBinary Representations of Integers and the Performance of Selectorecombinative Genetic Algorithms
Binary Representations of Integers and the Performance of Selectorecombinative Genetic Algorithms Franz Rothlauf Department of Information Systems University of Bayreuth, Germany franz.rothlauf@uni-bayreuth.de
More informationEnhancing Forecasting Performance of Naïve-Bayes Classifiers with Discretization Techniques
24 Enhancing Forecasting Performance of Naïve-Bayes Classifiers with Discretization Techniques Enhancing Forecasting Performance of Naïve-Bayes Classifiers with Discretization Techniques Ruxandra PETRE
More informationCalibrating Random Forests
Calibrating Random Forests Henrik Boström Informatics Research Centre University of Skövde 541 28 Skövde, Sweden henrik.bostrom@his.se Abstract When using the output of classifiers to calculate the expected
More informationBuilding Classifiers using Bayesian Networks
Building Classifiers using Bayesian Networks Nir Friedman and Moises Goldszmidt 1997 Presented by Brian Collins and Lukas Seitlinger Paper Summary The Naive Bayes classifier has reasonable performance
More informationNon-linear gating network for the large scale classification model CombNET-II
Non-linear gating network for the large scale classification model CombNET-II Mauricio Kugler, Toshiyuki Miyatani Susumu Kuroyanagi, Anto Satriyo Nugroho and Akira Iwata Department of Computer Science
More informationAnalyzing Outlier Detection Techniques with Hybrid Method
Analyzing Outlier Detection Techniques with Hybrid Method Shruti Aggarwal Assistant Professor Department of Computer Science and Engineering Sri Guru Granth Sahib World University. (SGGSWU) Fatehgarh Sahib,
More informationThe Role of Biomedical Dataset in Classification
The Role of Biomedical Dataset in Classification Ajay Kumar Tanwani and Muddassar Farooq Next Generation Intelligent Networks Research Center (nexgin RC) National University of Computer & Emerging Sciences
More informationHeuristic Rule-Based Regression via Dynamic Reduction to Classification Frederik Janssen and Johannes Fürnkranz
Heuristic Rule-Based Regression via Dynamic Reduction to Classification Frederik Janssen and Johannes Fürnkranz September 28, 2011 KDML @ LWA 2011 F. Janssen & J. Fürnkranz 1 Outline 1. Motivation 2. Separate-and-conquer
More informationEfficient Pairwise Classification
Efficient Pairwise Classification Sang-Hyeun Park and Johannes Fürnkranz TU Darmstadt, Knowledge Engineering Group, D-64289 Darmstadt, Germany Abstract. Pairwise classification is a class binarization
More informationCyber attack detection using decision tree approach
Cyber attack detection using decision tree approach Amit Shinde Department of Industrial Engineering, Arizona State University,Tempe, AZ, USA {amit.shinde@asu.edu} In this information age, information
More informationMetaData for Database Mining
MetaData for Database Mining John Cleary, Geoffrey Holmes, Sally Jo Cunningham, and Ian H. Witten Department of Computer Science University of Waikato Hamilton, New Zealand. Abstract: At present, a machine
More informationGenerating Rule Sets from Model Trees
Generating Rule Sets from Model Trees Geoffrey Holmes, Mark Hall and Eibe Frank Department of Computer Science University of Waikato, New Zealand {geoff,mhall,eibe}@cs.waikato.ac.nz Ph. +64 7 838-4405
More informationData Mining. 3.2 Decision Tree Classifier. Fall Instructor: Dr. Masoud Yaghini. Chapter 5: Decision Tree Classifier
Data Mining 3.2 Decision Tree Classifier Fall 2008 Instructor: Dr. Masoud Yaghini Outline Introduction Basic Algorithm for Decision Tree Induction Attribute Selection Measures Information Gain Gain Ratio
More informationImproving the Random Forest Algorithm by Randomly Varying the Size of the Bootstrap Samples for Low Dimensional Data Sets
Improving the Random Forest Algorithm by Randomly Varying the Size of the Bootstrap Samples for Low Dimensional Data Sets Md Nasim Adnan and Md Zahidul Islam Centre for Research in Complex Systems (CRiCS)
More informationBackpropagation in Decision Trees for Regression
Backpropagation in Decision Trees for Regression Victor Medina-Chico 1, Alberto Suárez 1, and James F. Lutsko 2 1 Escuela Técnica Superior de Informática Universidad Autónoma de Madrid Ciudad Universitaria
More informationDESIGN AND EVALUATION OF MACHINE LEARNING MODELS WITH STATISTICAL FEATURES
EXPERIMENTAL WORK PART I CHAPTER 6 DESIGN AND EVALUATION OF MACHINE LEARNING MODELS WITH STATISTICAL FEATURES The evaluation of models built using statistical in conjunction with various feature subset
More informationA Comparison of Decision Tree Algorithms For UCI Repository Classification
A Comparison of Decision Tree Algorithms For UCI Repository Classification Kittipol Wisaeng Mahasakham Business School (MBS), Mahasakham University Kantharawichai, Khamriang, Mahasarakham, 44150, Thailand.
More informationCOMP 465: Data Mining Classification Basics
Supervised vs. Unsupervised Learning COMP 465: Data Mining Classification Basics Slides Adapted From : Jiawei Han, Micheline Kamber & Jian Pei Data Mining: Concepts and Techniques, 3 rd ed. Supervised
More informationMachine Learning Techniques for Data Mining
Machine Learning Techniques for Data Mining Eibe Frank University of Waikato New Zealand 10/25/2000 1 PART VII Moving on: Engineering the input and output 10/25/2000 2 Applying a learner is not all Already
More informationAlgorithms: Decision Trees
Algorithms: Decision Trees A small dataset: Miles Per Gallon Suppose we want to predict MPG From the UCI repository A Decision Stump Recursion Step Records in which cylinders = 4 Records in which cylinders
More informationAdditive Regression Applied to a Large-Scale Collaborative Filtering Problem
Additive Regression Applied to a Large-Scale Collaborative Filtering Problem Eibe Frank 1 and Mark Hall 2 1 Department of Computer Science, University of Waikato, Hamilton, New Zealand eibe@cs.waikato.ac.nz
More informationRandom Forest A. Fornaser
Random Forest A. Fornaser alberto.fornaser@unitn.it Sources Lecture 15: decision trees, information theory and random forests, Dr. Richard E. Turner Trees and Random Forests, Adele Cutler, Utah State University
More informationData Mining. Part 2. Data Understanding and Preparation. 2.4 Data Transformation. Spring Instructor: Dr. Masoud Yaghini. Data Transformation
Data Mining Part 2. Data Understanding and Preparation 2.4 Spring 2010 Instructor: Dr. Masoud Yaghini Outline Introduction Normalization Attribute Construction Aggregation Attribute Subset Selection Discretization
More informationIMPROVEMENTS TO THE BACKPROPAGATION ALGORITHM
Annals of the University of Petroşani, Economics, 12(4), 2012, 185-192 185 IMPROVEMENTS TO THE BACKPROPAGATION ALGORITHM MIRCEA PETRINI * ABSTACT: This paper presents some simple techniques to improve
More informationSupervised Learning. Decision trees Artificial neural nets K-nearest neighbor Support vectors Linear regression Logistic regression...
Supervised Learning Decision trees Artificial neural nets K-nearest neighbor Support vectors Linear regression Logistic regression... Supervised Learning y=f(x): true function (usually not known) D: training
More informationInternational Journal of Scientific Research & Engineering Trends Volume 4, Issue 6, Nov-Dec-2018, ISSN (Online): X
Analysis about Classification Techniques on Categorical Data in Data Mining Assistant Professor P. Meena Department of Computer Science Adhiyaman Arts and Science College for Women Uthangarai, Krishnagiri,
More informationPerformance analysis of a MLP weight initialization algorithm
Performance analysis of a MLP weight initialization algorithm Mohamed Karouia (1,2), Régis Lengellé (1) and Thierry Denœux (1) (1) Université de Compiègne U.R.A. CNRS 817 Heudiasyc BP 49 - F-2 Compiègne
More informationComparing Univariate and Multivariate Decision Trees *
Comparing Univariate and Multivariate Decision Trees * Olcay Taner Yıldız, Ethem Alpaydın Department of Computer Engineering Boğaziçi University, 80815 İstanbul Turkey yildizol@cmpe.boun.edu.tr, alpaydin@boun.edu.tr
More informationHybrid Approach for Classification using Support Vector Machine and Decision Tree
Hybrid Approach for Classification using Support Vector Machine and Decision Tree Anshu Bharadwaj Indian Agricultural Statistics research Institute New Delhi, India anshu@iasri.res.in Sonajharia Minz Jawaharlal
More informationClassification and Regression Trees
Classification and Regression Trees David S. Rosenberg New York University April 3, 2018 David S. Rosenberg (New York University) DS-GA 1003 / CSCI-GA 2567 April 3, 2018 1 / 51 Contents 1 Trees 2 Regression
More informationGraph Propositionalization for Random Forests
Graph Propositionalization for Random Forests Thashmee Karunaratne Dept. of Computer and Systems Sciences, Stockholm University Forum 100, SE-164 40 Kista, Sweden si-thk@dsv.su.se Henrik Boström Dept.
More informationA Comparison of Text-Categorization Methods applied to N-Gram Frequency Statistics
A Comparison of Text-Categorization Methods applied to N-Gram Frequency Statistics Helmut Berger and Dieter Merkl 2 Faculty of Information Technology, University of Technology, Sydney, NSW, Australia hberger@it.uts.edu.au
More informationOptimal Extension of Error Correcting Output Codes
Book Title Book Editors IOS Press, 2003 1 Optimal Extension of Error Correcting Output Codes Sergio Escalera a, Oriol Pujol b, and Petia Radeva a a Centre de Visió per Computador, Campus UAB, 08193 Bellaterra
More informationCloNI: clustering of JN -interval discretization
CloNI: clustering of JN -interval discretization C. Ratanamahatana Department of Computer Science, University of California, Riverside, USA Abstract It is known that the naive Bayesian classifier typically
More informationExperimental analysis of methods for imputation of missing values in databases
Experimental analysis of methods for imputation of missing values in databases Alireza Farhangfar a, Lukasz Kurgan b, Witold Pedrycz c a IEEE Student Member (farhang@ece.ualberta.ca) b IEEE Member (lkurgan@ece.ualberta.ca)
More informationData Mining: Concepts and Techniques Classification and Prediction Chapter 6.7
Data Mining: Concepts and Techniques Classification and Prediction Chapter 6.7 March 1, 2007 CSE-4412: Data Mining 1 Chapter 6 Classification and Prediction 1. What is classification? What is prediction?
More informationObservational Learning with Modular Networks
Observational Learning with Modular Networks Hyunjung Shin, Hyoungjoo Lee and Sungzoon Cho {hjshin72, impatton, zoon}@snu.ac.kr Department of Industrial Engineering, Seoul National University, San56-1,
More informationMySQL Data Mining: Extending MySQL to support data mining primitives (demo)
MySQL Data Mining: Extending MySQL to support data mining primitives (demo) Alfredo Ferro, Rosalba Giugno, Piera Laura Puglisi, and Alfredo Pulvirenti Dept. of Mathematics and Computer Sciences, University
More information