Feature transformation by function decomposition
To appear in IEEE Expert Special Issue on Feature Transformation and Subset Selection (H. Liu and M. Motoda, eds.)

Blaž Zupan, Marko Bohanec, Janez Demšar, Ivan Bratko

While not explicitly intended for feature transformation, some methods for switching circuit design implicitly deal with this problem. Given a tabulated Boolean function, these methods construct a circuit that implements that function. In the 1950s and 1960s, Ashenhurst [1] and Curtis [2] proposed a function decomposition method that develops a switching circuit by constructing a nested hierarchy of tabulated Boolean functions. Both the hierarchy and the functions themselves are discovered by the decomposition method and are not given in advance. This is especially important from the viewpoint of feature construction, since the outputs of such functions can be regarded as new features not present in the original problem description.

The basic principle of function decomposition is the following. Let a tabulated function y = F(X) use a set of input features X = {x1, ..., xn}. The goal is to decompose this function into y = G(A, H(B)), where A and B are subsets of features in X such that A ∪ B = X. G and H are tabulated functions that are determined by the decomposition and are not predefined. Their joint complexity (determined by some complexity measure) should be lower than the complexity of F. Such a decomposition also discovers a new feature c = H(B). Since the decomposition can be applied recursively on H and G, the result in general is a hierarchy of features. For each feature in the hierarchy, there is a corresponding tabulated function (such as H(B)) that determines the dependency of that feature on its immediate descendants in the hierarchy.

Ashenhurst-Curtis decomposition was intended for switching circuit design of completely specified Boolean functions. Recently, Wan and Perkowski [3] extended the decomposition to handle incompletely specified Boolean functions. In the framework of feature extraction, Ross et al. [4] used a set of simple Boolean functions to show the decomposition's capability to discover and extract useful features.

This article presents an approach to feature transformation that is based on the extension of the function decomposition method by Zupan et al. [5]. This method allows the decomposition to deal with functions that involve nominal (i.e., not necessarily binary) features, and is implemented in a system called HINT (Hierarchy INduction Tool).

1 Single-step function decomposition

The core of the decomposition algorithm is single-step decomposition, which decomposes a tabulated function y = F(X) into two possibly less complex tabulated functions G and H, so that y = G(A, c) and c = H(B). The resulting functions G and H have to be consistent with F. For the purpose of decomposition, the set of features X is partitioned into two disjoint subsets A and B, referred to as the free and bound set, respectively. For a given feature partition, single-step decomposition discovers a new feature c and a tabular representation of H and G.

For example, consider a function y = F(x1, x2, x3) as given in Table 1. The input features x1, x2, and the output feature y can take the values lo, med, hi; input feature x3 can take the values lo, hi.

    x1   x2   x3   y
    lo   lo   lo   lo
    lo   lo   hi   lo
    lo   med  lo   lo
    lo   med  hi   med
    lo   hi   lo   lo
    lo   hi   hi   hi
    med  med  lo   med
    med  hi   lo   med
    med  hi   hi   hi
    hi   lo   lo   hi
    hi   hi   lo   hi

    Table 1: Tabulated function y = F(x1, x2, x3)

Suppose that we want to discover a description of a new feature c = H(x2, x3). For this purpose, we represent F by a partition matrix that uses the values of x1 for row labels, and the combinations of values of x2 and x3 for column labels (Figure 1a). Partition matrix entries with no corresponding instance in the tabulated representation of F are denoted with "-" and treated as don't-cares.
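To make the example concrete, the tabulated function F from Table 1 and its partition matrix for the bound set {x2, x3} can be written down directly. The following is an illustrative sketch in Python (our encoding, not part of HINT), with None standing for a don't-care entry:

```python
# Tabulated function y = F(x1, x2, x3) from Table 1; the 11 listed instances
# only partially define F, so missing combinations are don't-cares.
F = {
    ('lo',  'lo',  'lo'): 'lo',   ('lo',  'lo',  'hi'): 'lo',
    ('lo',  'med', 'lo'): 'lo',   ('lo',  'med', 'hi'): 'med',
    ('lo',  'hi',  'lo'): 'lo',   ('lo',  'hi',  'hi'): 'hi',
    ('med', 'med', 'lo'): 'med',  ('med', 'hi',  'lo'): 'med',
    ('med', 'hi',  'hi'): 'hi',   ('hi',  'lo',  'lo'): 'hi',
    ('hi',  'hi',  'lo'): 'hi',
}

# Partition matrix for free set A = {x1}, bound set B = {x2, x3}:
# one row per value of x1, one column per combination of (x2, x3).
# None marks an entry with no corresponding instance (a don't-care).
rows = ['lo', 'med', 'hi']                       # values of x1
cols = [(x2, x3) for x2 in ['lo', 'med', 'hi'] for x3 in ['lo', 'hi']]
matrix = {c: [F.get((x1,) + c) for x1 in rows] for c in cols}

print(matrix[('lo', 'hi')])    # column lo,hi -> ['lo', None, None]
```

Reading off a column of `matrix` gives the behavior of F for one fixed combination of the bound features, which is exactly what the decomposition compares when it checks column compatibility below.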
    x2:        lo    lo    med   med   hi    hi
    x3:        lo    hi    lo    hi    lo    hi
    x1 = lo:   lo    lo    lo    med   lo    hi
    x1 = med:  -     -     med   -     med   hi
    x1 = hi:   hi    -     -     -     hi    -

Figure 1: Partition matrix (a) and the corresponding incompatibility graph (b) for the function from Table 1. The node labels found by graph coloring are: label 1 for the nodes lo,lo; lo,hi; med,lo and hi,lo; label 2 for med,hi; and label 3 for hi,hi.

Each column in the partition matrix denotes the behavior of F for a specific combination of x2 and x3. Columns whose row entries are pairwise equal, or where at least one of the two entries in a row is a don't-care, are called compatible. The decomposition has to assign each column a label that corresponds to a value of c. If two columns are compatible, they can be assigned the same label; to preserve consistency, different labels have to be used for incompatible columns.

The partition matrix column labels are found by coloring an incompatibility graph (Figure 1b), which has a distinct node for each column of the partition matrix. Two nodes are connected if the corresponding columns of the partition matrix are incompatible. For instance, the nodes hi,hi and lo,hi are connected because their corresponding columns are incompatible due to the entries in the first row (hi ≠ lo). With an optimal coloring of the incompatibility graph, the new feature c obtains the minimal set of values needed for a consistent derivation of H and G from F. The optimal coloring of the graph from Figure 1b requires three different colors, i.e., three abstract values for c.

The tabulated functions G and H can then be derived straightforwardly from the labeled partition matrix. The resulting decomposition is given in Figure 2. The following can be observed: the tabulated functions G and H are overall smaller than the original function F.
    x1   c    y          x2   x3   c
    lo   1    lo         lo   lo   1
    lo   2    med        lo   hi   1
    lo   3    hi         med  lo   1
    med  1    med        med  hi   2
    med  3    hi         hi   lo   1
    hi   1    hi         hi   hi   3

Figure 2: Decomposition of the tabulated function from Table 1 using a new feature c = H(x2, x3). The left table defines y = G(x1, c) at the root of the hierarchy; the right table defines c = H(x2, x3) below it.

By combining the features x2 and x3, we obtained a new feature c and the corresponding tabulated function H, which can be interpreted as c = MIN(x2, x3). Similarly, the function G that relates x1 and c to y can be interpreted as y = MAX(x1, c). The decomposition also generalizes some undefined entries of the partition matrix. For example, F(hi, lo, hi) is generalized to hi because the column lo,hi has the same label as the columns lo,lo and hi,lo.

2 Finding the best feature partition

So far, we have assumed that a partition of the features into free and bound sets is given. However, for each function F there are many possible partitions, each one yielding a different feature c and a different pair of functions G and H. In feature transformation, it is important to identify partitions that lead to simple but useful new features. Typically, we prefer features with a small number of values, and those that yield functions G and H of low complexity.

The simplest measure for finding good feature partitions is the number of values required for the new feature c. That is, when decomposing F, a set of candidate partitions is examined and the one that yields a c with the smallest set of possible values is selected for decomposition.

An alternative measure for the selection of partitions can be based on the complexity of functions. Let I(F) denote the number of bits needed to encode some function F [6, 4]. Then, the best partition is the one that minimizes the overall complexity of the newly discovered functions H and G, i.e., the partition with minimal I(H) + I(G). The complexity of G and H should be lower than that of F, and decomposition takes place only if I(H) + I(G) < I(F) for the best partition.

The number of possible partitions increases exponentially with the number of input features. To limit the complexity of the search, the decomposition should examine only a subset of partitions. In HINT, the partitions examined are only those with at most b features in the bound set, where b is a user-defined parameter usually set to 2 or 3.

The partition matrix can be very large. However, it is never explicitly represented in the program, as HINT derives the incompatibility graph directly from the data [6]. The partition matrix is thus a formal construct used for explanation or analysis of the decomposition algorithm. The coloring of the incompatibility graph is another potential bottleneck; HINT uses an efficient heuristic algorithm for graph coloring [6].

3 Redundancy discovery and removal

A special characteristic of function decomposition is its capability to identify and remove redundant features and/or their values. Namely, whenever a feature c = H(xi) is discovered such that c uses only a single value (i.e., H is a constant function), the feature xi is redundant and can be removed from the dataset. Similarly, whenever a feature c = H(xi) uses fewer values than xi, the original feature xi can be replaced by c. Both these dataset transformations are particularly useful for data preprocessing, since they can reduce the size of the problem and increase the coverage of the problem space by the learning examples.

HINT removes redundant features in increasing order of their relevance: the less relevant features are checked first, and if any type of redundancy mentioned above is discovered, the dataset is transformed accordingly. The relevance of features is estimated by the ReliefF measure [7].
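The machinery of Sections 1 and 2 can be sketched in a few lines. The following is a toy reimplementation (ours, not HINT's; HINT uses more refined heuristics and never builds the partition matrix explicitly [6]) that, for each candidate bound set of the Table 1 example, builds the partition-matrix columns, colors the incompatibility graph greedily, and keeps the partition whose new feature c needs the fewest values:

```python
from itertools import combinations, product

# Example function from Table 1; missing combinations are don't-cares.
F = {
    ('lo',  'lo',  'lo'): 'lo',   ('lo',  'lo',  'hi'): 'lo',
    ('lo',  'med', 'lo'): 'lo',   ('lo',  'med', 'hi'): 'med',
    ('lo',  'hi',  'lo'): 'lo',   ('lo',  'hi',  'hi'): 'hi',
    ('med', 'med', 'lo'): 'med',  ('med', 'hi',  'lo'): 'med',
    ('med', 'hi',  'hi'): 'hi',   ('hi',  'lo',  'lo'): 'hi',
    ('hi',  'hi',  'lo'): 'hi',
}
domains = {'x1': ['lo', 'med', 'hi'], 'x2': ['lo', 'med', 'hi'],
           'x3': ['lo', 'hi']}
features = ['x1', 'x2', 'x3']

def n_values_of_c(bound):
    """Number of values the new feature c = H(bound) needs, via greedy coloring."""
    free = [f for f in features if f not in bound]
    def entry(free_vals, bound_vals):
        point = dict(zip(free, free_vals))
        point.update(zip(bound, bound_vals))
        return F.get(tuple(point[f] for f in features))
    rows = list(product(*(domains[f] for f in free)))
    cols = list(product(*(domains[f] for f in bound)))
    column = {c: [entry(r, c) for r in rows] for c in cols}
    def incompatible(a, b):
        # Incompatible: some row has two different defined (non-don't-care) entries.
        return any(u is not None and v is not None and u != v
                   for u, v in zip(column[a], column[b]))
    # Greedy heuristic coloring: compatible columns may share a label;
    # each color becomes one abstract value of the new feature c.
    color = {}
    for c in cols:
        used = {color[d] for d in cols if d in color and incompatible(c, d)}
        color[c] = min(k for k in range(1, len(cols) + 1) if k not in used)
    return max(color.values())

# Examine all bound sets of size 2 and keep the one giving the fewest values.
best = min(combinations(features, 2), key=n_values_of_c)
print(best, n_values_of_c(best))   # -> ('x2', 'x3') 3
```

On this example the greedy coloring happens to be optimal: the bound set {x2, x3} needs only three values for c, while {x1, x3} and {x1, x2} need four and five, so the partition from Section 1 is indeed the one selected.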
Figure 3: The MONK1 feature hierarchy as discovered by HINT.

4 Discovering feature hierarchies

With all the above ingredients, decomposition enables the discovery of feature hierarchies. Given a dataset, it is first preprocessed to remove redundant features and their values. Next, single-step decomposition is used to decompose the preprocessed tabulated function F into G and H. This process is recursively repeated on G and H until they cannot be decomposed further, i.e., until their further decomposition would increase the overall complexity of the resulting functions. Let us illustrate this process with several examples.

The first example is the well-known machine learning problem MONK1 [8]. The dataset uses six 2- to 4-valued input features (x1 to x6) and contains 124 (of 432 possible) distinct learning examples that partially define the target concept x1 = x2 OR x5 = 1. HINT develops a feature hierarchy (Figure 3) that

1. correctly excludes the irrelevant input features x3, x4, and x6,
2. transforms x5 to x5' by mapping the four values of x5 to only two values of x5',
3. includes a new feature c and its tabular representation for x1 = x2, and
4. relates c and x5' with a tabular representation of the OR function.

In other words, the resulting hierarchy correctly represents the target concept.

Similarly to MONK1, the MONK2 learning problem [8] uses the same set of input features but defines the concept: xi = 1 for exactly two choices of i ∈ {1, 2, ..., 6}. The dataset contains 172 (of 432 possible) distinct learning examples. Although the discovered structure (Figure 4) does not directly correspond to the original concept definition, it correctly reformulates the target concept by introducing features that count the number of ones in their arguments. Also note that all input features that use more than two values are replaced by new binary features.

Figure 4: The feature hierarchy discovered for MONK2. Each node gives the name of the feature and the cardinality of its set of values.

For a real-world example, consider the HOUSING dataset from a management decision support system for allocating housing loans. This system was developed for the Housing Fund of Slovenia and has been used since 1991 in 13 floats of loans with a total value of approximately 90 million ECU. Basically, its task is to rank applicants into one of nine priority classes based on 12 input features. The system uses a hierarchical decision model. For this experiment, we wanted to assess the ability of decomposition to reconstruct the system's model given a random selection of classified cases.
Figure 5: The feature structure discovered for the housing loan allocation problem. Each node gives the name of the feature and the cardinality of its set of values.

Figure 6: Averaged classification accuracy (CA, in %) of HINT and C4.5 over 10 experiments, as a function of the percentage p of the housing problem space used for learning.
First, it was observed that in most cases when HINT was given 8000 or more learning examples (4% coverage of the problem space), it discovered the same feature hierarchy (Figure 5). When presented to a domain expert, the discovered features were found meaningful. For instance, the feature c4 represents the present housing status of the applicant. Features c2, c6, and c8 respectively specify the way of solving the housing problem, the applicant's current status, and his/her social and health conditions. Thus, the structure of the original model was successfully reconstructed. This reconstruction takes about 25 minutes of processor time on an HP J210 Unix workstation.

The quality of the functions discovered for the housing problem was then assessed by classifying the remaining examples, which completely cover this domain but were not included in the learning set. The generalization of HINT was assessed as a learning curve (Figure 6) and compared to that of the state-of-the-art learning tool C4.5 [9], which induces decision trees. For this domain, HINT generalizes well, as it achieves 100% classification accuracy using learning sets that cover 3% or more of the problem space.

The housing problem domain can be viewed as a typical representative of the practical decision-making domains on which HINT was tested. Similar tests and experimental conclusions were obtained, for instance, by re-discovering hierarchical decision models for nursing school applications, job application evaluation, and performance evaluation of public enterprises [6].

5 Feature construction

Function decomposition can also be used for feature construction. This is defined as a process of augmenting the feature space by inferring or creating additional features. We here show how single-step decomposition can be used both for finding appropriate combinations of original features and for using them to construct new features, which are then added to the original dataset. Again, new features that use a small
number of values are preferred.

For example, consider again the dataset from Table 1 and suppose we want to augment it with a single new feature. For this purpose, single-step decomposition examines all pairs of original input features and for each pair derives a new feature; the new feature with the fewest values is preferred. Such a feature is c = H(x2, x3), defined by the example set from Figure 2. Function H can now be applied to each example from Table 1 to obtain the corresponding value for c; the augmented dataset is shown in Table 2.

    x1   x2   x3   c   y
    lo   lo   lo   1   lo
    lo   lo   hi   1   lo
    lo   med  lo   1   lo
    lo   med  hi   2   med
    lo   hi   lo   1   lo
    lo   hi   hi   3   hi
    med  med  lo   1   med
    med  hi   lo   1   med
    med  hi   hi   3   hi
    hi   lo   lo   1   hi
    hi   hi   lo   1   hi

    Table 2: The dataset from Table 1 augmented with a new feature c = H(x2, x3)

Obviously, an original dataset can be augmented with more than just a single new feature. We propose a schema where, using single-step decomposition, m new features are added to a dataset: a candidate set of combinations of the original features is examined, and the corresponding m new features that have the fewest values are selected. Ties are resolved arbitrarily. Only original features are used to generate new features; note also that in the process no hierarchy of features is constructed. The feature construction and the corresponding augmentation of the original dataset result in a new dataset, which may then be further used by some machine learning algorithm.

We evaluated this schema on the above-mentioned domains. For new features, only those that depend on two original input features were considered. For MONK1 and MONK2, the benefit of adding new features was assessed by 10-fold cross validation. For HOUSING, 10 experiments were performed with a learning set consisting of a random sample of 1% of the problem space, while the test set consisted of the remaining 99%. Each learning set was first augmented with m new features as described above and then given to C4.5 [9], which was used to induce the target concept from the augmented dataset. The induced decision tree was then tested on the test set, which was augmented with the same set of new features, i.e., using their definitions as derived from the learning set. In all cases, a considerable increase of classification accuracy (Figure 7) and a considerable decrease of the size of the induced decision trees (Figure 8) occur when new features are added. In other words, adding new features improves
Figure 7: Classification accuracy of C4.5 as a function of the number m of new features added to the MONK1 and MONK2 datasets (a) and to the housing loans dataset (b).

Figure 8: Number of nodes N of C4.5-induced decision trees as a function of the number m of new features added to the MONK1 and MONK2 datasets (a) and to the housing loans dataset (b).
both the classification accuracy and the transparency of the decision trees. The results indicate that function decomposition discovers features that are relevant for these domains. It is further interesting to note how well a simple feature selection criterion, based on the number of values of the new feature, performs in terms of yielding augmented datasets that enable C4.5 to derive classifiers with high classification accuracy.

6 Further work

In the framework of feature transformation, function decomposition is a promising approach to discovering and constructing new features that are either added to the original dataset or used to transform this dataset into a hierarchy of less complex datasets. The function decomposition algorithm in HINT performs a kind of generalization, so it can also be viewed as a machine learning algorithm.

The approach described in this article is limited to consistent datasets and nominal features. It is therefore desirable to extend the approach to discover new features from noisy data and from data that uses continuous features. To handle noisy data, a minimal-error decomposition was recently proposed [6]. It is based on a representation of learning examples with class distributions, and uses successive merging of partition matrix columns so that the expected error of classification is minimized. So far, this method has been evaluated only as a machine learning tool; further work is needed to assess its appropriateness for feature transformation problems.

References

[1] R. L. Ashenhurst, "The decomposition of switching functions," Tech. Rep. BL-1(11), Bell Laboratories, 1952.

[2] H. A. Curtis, A New Approach to the Design of Switching Circuits. Van Nostrand, Princeton, NJ, 1962.

[3] W. Wan and M. A. Perkowski, "A new approach to the decomposition of incompletely specified functions based on graph-coloring and local transformations and its application to FPGA mapping," in Proc. of the IEEE EURO-DAC '92, Hamburg, 1992.
[4] T. D. Ross, M. J. Noviskey, D. A. Gadd, and J. A. Goldman, "Pattern theoretic feature extraction and constructive induction," in Proc. ML-COLT '94 Workshop on Constructive Induction and Change of Representation, New Brunswick, New Jersey, July 1994.

[5] B. Zupan, M. Bohanec, I. Bratko, and J. Demšar, "Machine learning by function decomposition," in Proc. Fourteenth International Conference on Machine Learning (D. H. Fisher, ed.), Morgan Kaufmann, San Mateo, CA, 1997.

[6] B. Zupan, Machine Learning Based on Function Decomposition. PhD thesis, University of Ljubljana, April 1997.

[7] I. Kononenko, "Estimating attributes," in Proc. of the European Conference on Machine Learning (ECML-94) (F. Bergadano and L. De Raedt, eds.), vol. 784 of Lecture Notes in Artificial Intelligence, Catania, Springer-Verlag, 1994.

[8] S. B. Thrun et al., "A performance comparison of different learning algorithms," Tech. rep., Carnegie Mellon University, 1991.

[9] J. R. Quinlan, C4.5: Programs for Machine Learning. Morgan Kaufmann, 1993.
Blaž Zupan is a research assistant at the Jožef Stefan Institute, Ljubljana, Slovenia. His research interests include machine learning, intelligent data analysis and knowledge discovery in databases, computer-aided support for medical decision making, and medical informatics. He holds an MS in computer science from the University of Houston, Texas, and a PhD from the University of Ljubljana, Slovenia. He can be reached at the Department of Intelligent Systems, J. Stefan Institute, Jamova 39, SI-1000 Ljubljana, Slovenia; blaz.zupan@ijs.si.

Marko Bohanec is a research associate at the Jožef Stefan Institute, Ljubljana, Slovenia, and assistant professor of information systems at the Faculty of Organizational Sciences, University of Maribor. His research and development interests are in decision support systems and machine learning. He obtained his PhD in computer science at the Faculty of Electrical Engineering and Computer Science, University of Ljubljana. He has published in journals such as Machine Learning, Acta Psychologica, and Information & Management. Contact him at the Department of Intelligent Systems, J. Stefan Institute, Jamova 39, SI-1000 Ljubljana, Slovenia; marko.bohanec@ijs.si.

Ivan Bratko is a professor of computer science at the Faculty of Computer and Information Science, University of Ljubljana, Slovenia. He heads the AI laboratories at the Jožef Stefan Institute and the University. He is the chairman of ISSEK, the International School for the Synthesis of Expert Knowledge. He has conducted research in machine learning, knowledge-based systems, qualitative modeling, intelligent robotics, heuristic programming, and computer chess. His main interests in machine learning have been learning from noisy data, combining learning and qualitative reasoning, and various applications of machine learning and inductive logic programming, including medicine, ecological modeling, and control of dynamic systems. He is the author of the widely adopted text Prolog Programming for Artificial Intelligence (Addison-Wesley, 1986; second edition 1990). He co-edited (with N. Lavrač) Progress in Machine Learning (Sigma Press, 1987) and co-authored (with I. Mozetič and N. Lavrač) KARDIO: A Study in Deep and Qualitative Knowledge for Expert Systems (MIT Press, 1989). Contact him at the Faculty of Computer and Information Science, University of Ljubljana, Tržaška 25, SI-1000 Ljubljana, Slovenia; ivan.bratko@fri.uni-lj.si.

Janez Demšar is a research assistant at the Faculty of Computer and Information Science, University of Ljubljana, Slovenia. He has a BSc in computer science from the same faculty. His research interests are machine learning and intelligent data analysis. He can be reached at the Faculty of Computer and Information Science, University of Ljubljana, Tržaška 25, SI-1000 Ljubljana, Slovenia; janez.demsar@fri.uni-lj.si.
More informationA Logic Language of Granular Computing
A Logic Language of Granular Computing Yiyu Yao and Bing Zhou Department of Computer Science University of Regina Regina, Saskatchewan, Canada S4S 0A2 E-mail: {yyao, zhou200b}@cs.uregina.ca Abstract Granular
More informationCONCEPT FORMATION AND DECISION TREE INDUCTION USING THE GENETIC PROGRAMMING PARADIGM
1 CONCEPT FORMATION AND DECISION TREE INDUCTION USING THE GENETIC PROGRAMMING PARADIGM John R. Koza Computer Science Department Stanford University Stanford, California 94305 USA E-MAIL: Koza@Sunburn.Stanford.Edu
More informationFeature Selection. CE-725: Statistical Pattern Recognition Sharif University of Technology Spring Soleymani
Feature Selection CE-725: Statistical Pattern Recognition Sharif University of Technology Spring 2013 Soleymani Outline Dimensionality reduction Feature selection vs. feature extraction Filter univariate
More informationData Mining. Introduction. Hamid Beigy. Sharif University of Technology. Fall 1394
Data Mining Introduction Hamid Beigy Sharif University of Technology Fall 1394 Hamid Beigy (Sharif University of Technology) Data Mining Fall 1394 1 / 20 Table of contents 1 Introduction 2 Data mining
More informationData Preprocessing Yudho Giri Sucahyo y, Ph.D , CISA
Obj ti Objectives Motivation: Why preprocess the Data? Data Preprocessing Techniques Data Cleaning Data Integration and Transformation Data Reduction Data Preprocessing Lecture 3/DMBI/IKI83403T/MTI/UI
More informationInternational Journal of Scientific Research & Engineering Trends Volume 4, Issue 6, Nov-Dec-2018, ISSN (Online): X
Analysis about Classification Techniques on Categorical Data in Data Mining Assistant Professor P. Meena Department of Computer Science Adhiyaman Arts and Science College for Women Uthangarai, Krishnagiri,
More informationLearning Classification Rules for Multiple Target Attributes
Learning Classification Rules for Multiple Target Attributes Bernard Ženko and Sašo Džeroski Department of Knowledge Technologies, Jožef Stefan Institute Jamova cesta 39, SI-1000 Ljubljana, Slovenia, Bernard.Zenko@ijs.si,
More informationGraph Matching: Fast Candidate Elimination Using Machine Learning Techniques
Graph Matching: Fast Candidate Elimination Using Machine Learning Techniques M. Lazarescu 1,2, H. Bunke 1, and S. Venkatesh 2 1 Computer Science Department, University of Bern, Switzerland 2 School of
More informationBRACE: A Paradigm For the Discretization of Continuously Valued Data
Proceedings of the Seventh Florida Artificial Intelligence Research Symposium, pp. 7-2, 994 BRACE: A Paradigm For the Discretization of Continuously Valued Data Dan Ventura Tony R. Martinez Computer Science
More informationJOURNAL OF OBJECT TECHNOLOGY
JOURNAL OF OBJECT TECHNOLOGY Online at www.jot.fm. Published by ETH Zurich, Chair of Software Engineering JOT, 2002 Vol. 1, No. 2, July-August 2002 The Theory of Classification Part 2: The Scratch-Built
More informationConstructively Learning a Near-Minimal Neural Network Architecture
Constructively Learning a Near-Minimal Neural Network Architecture Justin Fletcher and Zoran ObradoviC Abetract- Rather than iteratively manually examining a variety of pre-specified architectures, a constructive
More informationKeywords: clustering algorithms, unsupervised learning, cluster validity
Volume 6, Issue 1, January 2016 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Clustering Based
More informationEfficient Case Based Feature Construction
Efficient Case Based Feature Construction Ingo Mierswa and Michael Wurst Artificial Intelligence Unit,Department of Computer Science, University of Dortmund, Germany {mierswa, wurst}@ls8.cs.uni-dortmund.de
More informationDetermining Differences between Two Sets of Polygons
Determining Differences between Two Sets of Polygons MATEJ GOMBOŠI, BORUT ŽALIK Institute for Computer Science Faculty of Electrical Engineering and Computer Science, University of Maribor Smetanova 7,
More informationStructural Advantages for Ant Colony Optimisation Inherent in Permutation Scheduling Problems
Structural Advantages for Ant Colony Optimisation Inherent in Permutation Scheduling Problems James Montgomery No Institute Given Abstract. When using a constructive search algorithm, solutions to scheduling
More informationA Method for Construction of Orthogonal Arrays 1
Eighth International Workshop on Optimal Codes and Related Topics July 10-14, 2017, Sofia, Bulgaria pp. 49-54 A Method for Construction of Orthogonal Arrays 1 Iliya Bouyukliev iliyab@math.bas.bg Institute
More informationReality Mining Via Process Mining
Reality Mining Via Process Mining O. M. Hassan, M. S. Farag, and M. M. Mohie El-Din Abstract Reality mining project work on Ubiquitous Mobile Systems (UMSs) that allow for automated capturing of events.
More informationPETRI NET BASED SCHEDULING APPROACH COMBINING DISPATCHING RULES AND LOCAL SEARCH
PETRI NET BASED SCHEDULING APPROACH COMBINING DISPATCHING RULES AND LOCAL SEARCH Gašper Mušič (a) (a) University of Ljubljana Faculty of Electrical Engineering Tržaška 25, Ljubljana, Slovenia (a) gasper.music@fe.uni-lj.si
More informationOptimized Implementation of Logic Functions
June 25, 22 9:7 vra235_ch4 Sheet number Page number 49 black chapter 4 Optimized Implementation of Logic Functions 4. Nc3xe4, Nb8 d7 49 June 25, 22 9:7 vra235_ch4 Sheet number 2 Page number 5 black 5 CHAPTER
More informationBuilding a Concept Hierarchy from a Distance Matrix
Building a Concept Hierarchy from a Distance Matrix Huang-Cheng Kuo 1 and Jen-Peng Huang 2 1 Department of Computer Science and Information Engineering National Chiayi University, Taiwan 600 hckuo@mail.ncyu.edu.tw
More informationAn Improved Apriori Algorithm for Association Rules
Research article An Improved Apriori Algorithm for Association Rules Hassan M. Najadat 1, Mohammed Al-Maolegi 2, Bassam Arkok 3 Computer Science, Jordan University of Science and Technology, Irbid, Jordan
More informatione-ccc-biclustering: Related work on biclustering algorithms for time series gene expression data
: Related work on biclustering algorithms for time series gene expression data Sara C. Madeira 1,2,3, Arlindo L. Oliveira 1,2 1 Knowledge Discovery and Bioinformatics (KDBIO) group, INESC-ID, Lisbon, Portugal
More informationREDUCING GRAPH COLORING TO CLIQUE SEARCH
Asia Pacific Journal of Mathematics, Vol. 3, No. 1 (2016), 64-85 ISSN 2357-2205 REDUCING GRAPH COLORING TO CLIQUE SEARCH SÁNDOR SZABÓ AND BOGDÁN ZAVÁLNIJ Institute of Mathematics and Informatics, University
More informationDiscretizing Continuous Attributes Using Information Theory
Discretizing Continuous Attributes Using Information Theory Chang-Hwan Lee Department of Information and Communications, DongGuk University, Seoul, Korea 100-715 chlee@dgu.ac.kr Abstract. Many classification
More informationDiscretization and Grouping: Preprocessing Steps for Data Mining
Discretization and Grouping: Preprocessing Steps for Data Mining Petr Berka 1 and Ivan Bruha 2 z Laboratory of Intelligent Systems Prague University of Economic W. Churchill Sq. 4, Prague CZ-13067, Czech
More informationOn Reduct Construction Algorithms
1 On Reduct Construction Algorithms Yiyu Yao 1, Yan Zhao 1 and Jue Wang 2 1 Department of Computer Science, University of Regina Regina, Saskatchewan, Canada S4S 0A2 {yyao, yanzhao}@cs.uregina.ca 2 Laboratory
More informationFeature-weighted k-nearest Neighbor Classifier
Proceedings of the 27 IEEE Symposium on Foundations of Computational Intelligence (FOCI 27) Feature-weighted k-nearest Neighbor Classifier Diego P. Vivencio vivencio@comp.uf scar.br Estevam R. Hruschka
More informationFormal Concept Analysis and Hierarchical Classes Analysis
Formal Concept Analysis and Hierarchical Classes Analysis Yaohua Chen, Yiyu Yao Department of Computer Science, University of Regina Regina, Saskatchewan, Canada S4S 0A2 E-mail: {chen115y, yyao}@cs.uregina.ca
More informationVerbalizing Business Rules: Part 9
Verbalizing Business Rules: Part 9 Terry Halpin Northface University Business rules should be validated by business domain experts, and hence specified using concepts and languages easily understood by
More informationData Analysis and Mining in Ordered Information Tables
Data Analysis and Mining in Ordered Information Tables Ying Sai, Y.Y. Yao Department of Computer Science University of Regina Regina, Saskatchewan, Canada S4S 0A2 E-mail: yyao@cs.uregina.ca Ning Zhong
More informationIntroduction to Relational Databases. Introduction to Relational Databases cont: Introduction to Relational Databases cont: Relational Data structure
Databases databases Terminology of relational model Properties of database relations. Relational Keys. Meaning of entity integrity and referential integrity. Purpose and advantages of views. The relational
More informationMultimodal Information Spaces for Content-based Image Retrieval
Research Proposal Multimodal Information Spaces for Content-based Image Retrieval Abstract Currently, image retrieval by content is a research problem of great interest in academia and the industry, due
More informationA Quadtree-based Progressive Lossless Compression Technique for Volumetric Data Sets
JOURNAL OF INFORMATION SCIENCE AND ENGINEERING, 87-95 (008) Short Paper A Quadtree-based Progressive Lossless Compression Technique for Volumetric Data Sets GREGOR KLAJNSEK, BORUT ZALIK, FRANC NOVAK *
More informationProject Assignment 2 (due April 6 th, 2016, 4:00pm, in class hard-copy please)
Virginia Tech. Computer Science CS 4604 Introduction to DBMS Spring 2016, Prakash Project Assignment 2 (due April 6 th, 2016, 4:00pm, in class hard-copy please) Reminders: a. Out of 100 points. Contains
More informationSD-Map A Fast Algorithm for Exhaustive Subgroup Discovery
SD-Map A Fast Algorithm for Exhaustive Subgroup Discovery Martin Atzmueller and Frank Puppe University of Würzburg, 97074 Würzburg, Germany Department of Computer Science Phone: +49 931 888-6739, Fax:
More informationClassification with Diffuse or Incomplete Information
Classification with Diffuse or Incomplete Information AMAURY CABALLERO, KANG YEN Florida International University Abstract. In many different fields like finance, business, pattern recognition, communication
More informationFormalizing the PRODIGY Planning Algorithm
Formalizing the PRODIGY Planning Algorithm Eugene Fink eugene@cs.cmu.edu http://www.cs.cmu.edu/~eugene Manuela Veloso veloso@cs.cmu.edu http://www.cs.cmu.edu/~mmv Computer Science Department, Carnegie
More informationAdvances in Databases and Information Systems 1997
ELECTRONIC WORKSHOPS IN COMPUTING Series edited by Professor C.J. van Rijsbergen Rainer Manthey and Viacheslav Wolfengagen (Eds) Advances in Databases and Information Systems 1997 Proceedings of the First
More informationThe Relationship between Slices and Module Cohesion
The Relationship between Slices and Module Cohesion Linda M. Ott Jeffrey J. Thuss Department of Computer Science Michigan Technological University Houghton, MI 49931 Abstract High module cohesion is often
More informationGreedy Homework Problems
CS 1510 Greedy Homework Problems 1. (2 points) Consider the following problem: INPUT: A set S = {(x i, y i ) 1 i n} of intervals over the real line. OUTPUT: A maximum cardinality subset S of S such that
More informationRigidity, connectivity and graph decompositions
First Prev Next Last Rigidity, connectivity and graph decompositions Brigitte Servatius Herman Servatius Worcester Polytechnic Institute Page 1 of 100 First Prev Next Last Page 2 of 100 We say that a framework
More information19 Machine Learning in Lisp
19 Machine Learning in Lisp Chapter Objectives Chapter Contents ID3 algorithm and inducing decision trees from lists of examples. A basic Lisp implementation of ID3 Demonstration on a simple credit assessment
More informationA Comparison of Global and Local Probabilistic Approximations in Mining Data with Many Missing Attribute Values
A Comparison of Global and Local Probabilistic Approximations in Mining Data with Many Missing Attribute Values Patrick G. Clark Department of Electrical Eng. and Computer Sci. University of Kansas Lawrence,
More informationSet Manipulation with Boolean Functional Vectors for Symbolic Reachability Analysis
Set Manipulation with Boolean Functional Vectors for Symbolic Reachability Analysis Amit Goel Department of ECE, Carnegie Mellon University, PA. 15213. USA. agoel@ece.cmu.edu Randal E. Bryant Computer
More informationarxiv: v1 [cs.dm] 21 Dec 2015
The Maximum Cardinality Cut Problem is Polynomial in Proper Interval Graphs Arman Boyacı 1, Tinaz Ekim 1, and Mordechai Shalom 1 Department of Industrial Engineering, Boğaziçi University, Istanbul, Turkey
More informationHybrid Feature Selection for Modeling Intrusion Detection Systems
Hybrid Feature Selection for Modeling Intrusion Detection Systems Srilatha Chebrolu, Ajith Abraham and Johnson P Thomas Department of Computer Science, Oklahoma State University, USA ajith.abraham@ieee.org,
More informationTheorem 2.9: nearest addition algorithm
There are severe limits on our ability to compute near-optimal tours It is NP-complete to decide whether a given undirected =(,)has a Hamiltonian cycle An approximation algorithm for the TSP can be used
More informationCHAPTER 4: CLUSTER ANALYSIS
CHAPTER 4: CLUSTER ANALYSIS WHAT IS CLUSTER ANALYSIS? A cluster is a collection of data-objects similar to one another within the same group & dissimilar to the objects in other groups. Cluster analysis
More informationEmbedding Large Complete Binary Trees in Hypercubes with Load Balancing
JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING 35, 104 109 (1996) ARTICLE NO. 0073 Embedding Large Complete Binary Trees in Hypercubes with Load Balancing KEMAL EFE Center for Advanced Computer Studies,
More informationExtending the Representation Capabilities of Shape Grammars: A Parametric Matching Technique for Shapes Defined by Curved Lines
From: AAAI Technical Report SS-03-02. Compilation copyright 2003, AAAI (www.aaai.org). All rights reserved. Extending the Representation Capabilities of Shape Grammars: A Parametric Matching Technique
More informationHandling Missing Attribute Values in Preterm Birth Data Sets
Handling Missing Attribute Values in Preterm Birth Data Sets Jerzy W. Grzymala-Busse 1, Linda K. Goodwin 2, Witold J. Grzymala-Busse 3, and Xinqun Zheng 4 1 Department of Electrical Engineering and Computer
More informationDecision Trees Dr. G. Bharadwaja Kumar VIT Chennai
Decision Trees Decision Tree Decision Trees (DTs) are a nonparametric supervised learning method used for classification and regression. The goal is to create a model that predicts the value of a target
More informationLOW-DENSITY PARITY-CHECK (LDPC) codes [1] can
208 IEEE TRANSACTIONS ON MAGNETICS, VOL 42, NO 2, FEBRUARY 2006 Structured LDPC Codes for High-Density Recording: Large Girth and Low Error Floor J Lu and J M F Moura Department of Electrical and Computer
More informationA Co-Clustering approach for Sum-Product Network Structure Learning
Università degli Studi di Bari Dipartimento di Informatica LACAM Machine Learning Group A Co-Clustering approach for Sum-Product Network Antonio Vergari Nicola Di Mauro Floriana Esposito December 8, 2014
More informationInternational Journal of Scientific & Engineering Research, Volume 5, Issue 7, July ISSN
International Journal of Scientific & Engineering Research, Volume 5, Issue 7, July-2014 445 Clusters using High Dimensional Data for Feature Subset Algorithm V. Abhilash M. Tech Student Department of
More informationA 12-STEP SORTING NETWORK FOR 22 ELEMENTS
A 12-STEP SORTING NETWORK FOR 22 ELEMENTS SHERENAZ W. AL-HAJ BADDAR Department of Computer Science, Kent State University Kent, Ohio 44240, USA KENNETH E. BATCHER Department of Computer Science, Kent State
More informationA DH-parameter based condition for 3R orthogonal manipulators to have 4 distinct inverse kinematic solutions
Wenger P., Chablat D. et Baili M., A DH-parameter based condition for R orthogonal manipulators to have 4 distinct inverse kinematic solutions, Journal of Mechanical Design, Volume 17, pp. 150-155, Janvier
More informationAssociation Rules. Berlin Chen References:
Association Rules Berlin Chen 2005 References: 1. Data Mining: Concepts, Models, Methods and Algorithms, Chapter 8 2. Data Mining: Concepts and Techniques, Chapter 6 Association Rules: Basic Concepts A
More informationData Mining. Part 2. Data Understanding and Preparation. 2.4 Data Transformation. Spring Instructor: Dr. Masoud Yaghini. Data Transformation
Data Mining Part 2. Data Understanding and Preparation 2.4 Spring 2010 Instructor: Dr. Masoud Yaghini Outline Introduction Normalization Attribute Construction Aggregation Attribute Subset Selection Discretization
More informationValidating Plans with Durative Actions via Integrating Boolean and Numerical Constraints
Validating Plans with Durative Actions via Integrating Boolean and Numerical Constraints Roman Barták Charles University in Prague, Faculty of Mathematics and Physics Institute for Theoretical Computer
More informationNetwork Simplification Through Oracle Learning
Brigham Young University BYU ScholarsArchive All Faculty Publications 2002-05-17 Network Simplification Through Oracle Learning Tony R. Martinez martinez@cs.byu.edu Joshua Menke See next page for additional
More information