Feature transformation by function decomposition

To appear in IEEE Expert Special Issue on Feature Transformation and Subset Selection (H. Liu, H. Motoda, eds.)

Blaž Zupan, Marko Bohanec, Janez Demšar, Ivan Bratko

While not explicitly intended for feature transformation, some methods for switching circuit design implicitly deal with this problem. Given a tabulated Boolean function, these methods construct a circuit that implements that function. In the 1950s and 1960s, Ashenhurst [1] and Curtis [2] proposed a function decomposition method that develops a switching circuit by constructing a nested hierarchy of tabulated Boolean functions. Both the hierarchy and the functions themselves are discovered by the decomposition method and are not given in advance. This is especially important from the viewpoint of feature construction, since the outputs of such functions can be regarded as new features not present in the original problem description.

The basic principle of function decomposition is the following. Let a tabulated function y = F(X) use a set of input features X = {x1, ..., xn}. The goal is to decompose this function into y = G(A, H(B)), where A and B are subsets of features in X such that A ∪ B = X. G and H are tabulated functions that are determined by the decomposition and are not predefined. Their joint complexity (determined by some complexity measure) should be lower than the complexity of F. Such a decomposition also discovers a new feature c = H(B). Since the decomposition can be applied recursively on H and G, the result in general is a hierarchy of features. For each feature in the hierarchy, there is a corresponding tabulated function (such as H(B)) that determines the dependency of that feature on its immediate descendants in the hierarchy.

Ashenhurst-Curtis decomposition was intended for the switching circuit design of completely specified Boolean functions. Recently, Wan and Perkowski [3] extended the decomposition to handle incompletely specified Boolean functions. In the framework of feature extraction, Ross et al. [4] used a set of simple Boolean functions to show the decomposition's capability to discover and extract useful features. This article presents an approach to feature transformation that is based on an extension of the function decomposition method by Zupan et al. [5]. This method allows the decomposition to deal with functions that involve nominal (i.e., not necessarily binary) features, and is implemented in a system called HINT (Hierarchy INduction Tool).

1 Single-step function decomposition

The core of the decomposition algorithm is single-step decomposition, which decomposes a tabulated function y = F(X) into two possibly less complex tabulated functions G and H, so that y = G(A, c) and c = H(B). The resulting functions G and H have to be consistent with F. For the purpose of decomposition, the set of features X is partitioned into two disjoint subsets A and B, referred to as the free and bound set, respectively. For a given feature partition, single-step decomposition discovers a new feature c and a tabular representation of H and G.

For example, consider the function y = F(x1, x2, x3) given in Table 1. The input features x1 and x2 and the output feature y can take the values lo, med, hi; input feature x3 can take the values lo, hi.

    x1    x2    x3    y
    lo    lo    lo    lo
    lo    lo    hi    lo
    lo    med   lo    lo
    lo    med   hi    med
    lo    hi    lo    lo
    lo    hi    hi    hi
    med   med   lo    med
    med   hi    lo    med
    med   hi    hi    hi
    hi    lo    lo    hi
    hi    hi    lo    hi

Table 1: Tabulated function y = F(x1, x2, x3)

Suppose that we want to discover a description of a new feature c = H(x2, x3). For this purpose, we represent F by a partition matrix that uses the values of x1 for row labels and the combinations of values of x2 and x3 for column labels (Figure 1a). Partition matrix entries with no corresponding instance in the tabulated representation of F are denoted with '-' and treated as don't-cares.

             x2:  lo    lo    med   med   hi    hi
             x3:  lo    hi    lo    hi    lo    hi
    x1 = lo       lo    lo    lo    med   lo    hi
    x1 = med      -     -     med   -     med   hi
    x1 = hi       hi    -     -     -     hi    -

    (b) [incompatibility graph: columns lo,lo, lo,hi, med,lo, and hi,lo colored 1; column med,hi colored 2; column hi,hi colored 3]

Figure 1: Partition matrix (a) and the corresponding incompatibility graph (b) for the function from Table 1. The node labels of the incompatibility graph found by graph coloring are circled.

Each column in the partition matrix describes the behavior of F for a specific combination of x2 and x3. Two columns are compatible if, in every row, their entries are equal or at least one of them is a don't-care. The decomposition has to assign each column a label that corresponds to a value of c. If two columns are compatible, they can be assigned the same label; to preserve consistency, different labels have to be used for incompatible columns.

The column labels are found by coloring an incompatibility graph (Figure 1b), which has a distinct node for each column of the partition matrix. Two nodes are connected if the corresponding columns of the partition matrix are incompatible. For instance, the nodes hi,hi and lo,hi are connected because their corresponding columns are incompatible due to the entries in the first row (hi ≠ lo). With an optimal coloring of the incompatibility graph, the new feature c obtains the minimal set of values needed for a consistent derivation of H and G from F. The optimal coloring of the graph from Figure 1b requires three different colors, i.e., three abstract values for c. The tabulated functions G and H can then be derived straightforwardly from the labeled partition matrix; the resulting decomposition is given in Figure 2.
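This single decomposition step is small enough to spell out in code. The following is a minimal, self-contained sketch in Python (our illustration, not HINT's implementation): it tabulates the partition-matrix columns for the bound set {x2, x3}, links incompatible columns in an incompatibility graph, and colors the graph greedily. Its output reproduces the H table shown in Figure 2 below.

    from itertools import combinations

    # Tabulated function F from Table 1: (x1, x2, x3) -> y
    F = {
        ('lo','lo','lo'): 'lo',    ('lo','lo','hi'): 'lo',
        ('lo','med','lo'): 'lo',   ('lo','med','hi'): 'med',
        ('lo','hi','lo'): 'lo',    ('lo','hi','hi'): 'hi',
        ('med','med','lo'): 'med', ('med','hi','lo'): 'med',
        ('med','hi','hi'): 'hi',   ('hi','lo','lo'): 'hi',
        ('hi','hi','lo'): 'hi',
    }

    x1_vals = ['lo', 'med', 'hi']                      # free set A = {x1}
    x23_vals = [(v2, v3) for v2 in ['lo', 'med', 'hi']
                for v3 in ['lo', 'hi']]                # bound set B = {x2, x3}

    # One partition-matrix column per (x2, x3) pair; None marks a don't-care.
    cols = {b: [F.get((a,) + b) for a in x1_vals] for b in x23_vals}

    def compatible(c1, c2):
        """Columns are compatible if they agree wherever both are defined."""
        return all(u is None or v is None or u == v for u, v in zip(c1, c2))

    # Incompatibility graph: one edge per incompatible pair of columns.
    edges = {frozenset((p, q)) for p, q in combinations(x23_vals, 2)
             if not compatible(cols[p], cols[q])}

    # Greedy coloring; the colors are the values of the new feature c.
    color = {}
    for b in x23_vals:
        used = {color[o] for o in color if frozenset((b, o)) in edges}
        color[b] = min(k for k in range(1, len(x23_vals) + 2) if k not in used)

    for (v2, v3), c in color.items():
        print(f'H({v2}, {v3}) = {c}')    # reproduces H from Figure 2

Optimal graph coloring is NP-hard in general; the greedy pass happens to be optimal on this example, and HINT likewise resorts to a heuristic coloring algorithm [6].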

    G:  x1    c    y          H:  x2    x3    c
        lo    1    lo             lo    lo    1
        lo    2    med            lo    hi    1
        lo    3    hi             med   lo    1
        med   1    med            med   hi    2
        med   3    hi             hi    lo    1
        hi    1    hi             hi    hi    3

Figure 2: Decomposition of the tabulated function from Table 1 using the new feature c = H(x2, x3)

The following can be observed:

- The tabulated functions G and H are overall smaller than the original function F.
- By combining the features x2 and x3 we obtained a new feature c and the corresponding tabulated function H, which can be interpreted as c = MIN(x2, x3). Similarly, the function G that relates x1 and c to y can be interpreted as y = MAX(x1, c).
- The decomposition generalizes some undefined entries of the partition matrix. For example, F(hi, lo, hi) is generalized to hi because the column lo,hi has the same label as the columns lo,lo and hi,lo.

2 Finding the best feature partition

So far, we have assumed that a partition of the features into the free and bound sets is given. However, for each function F there are many possible partitions, each yielding a different feature c and a different pair of functions G and H. In feature transformation, it is important to identify partitions that lead to simple but useful new features. Typically, we prefer features with a small number of values, and those that yield functions G and H of low complexity.

The simplest measure for finding good feature partitions is the number of values required for the new feature c. That is, when decomposing F, a set of candidate partitions is examined and the one that yields c with the smallest set of possible values is selected for decomposition.
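To make this criterion concrete, here is a sketch (hypothetical helper code in the spirit of the previous example, not HINT's own routine) that scores every bound set of a given size by the number of values its new feature would need and keeps the cheapest one:

    from itertools import combinations

    def n_values_of_c(F, n_features, bound):
        """Greedy estimate of how many values c = H(bound) needs; F maps
        input tuples to class values, bound is a tuple of feature indices."""
        free = [i for i in range(n_features) if i not in bound]
        cols = {}
        for x, y in F.items():
            b = tuple(x[i] for i in bound)
            cols.setdefault(b, {})[tuple(x[i] for i in free)] = y
        def clash(c1, c2):   # incompatible: disagree on a shared free-part row
            return any(c1[a] != c2[a] for a in c1.keys() & c2.keys())
        color = {}
        for b in cols:
            used = {color[o] for o in color if clash(cols[o], cols[b])}
            color[b] = min(k for k in range(1, len(cols) + 2) if k not in used)
        return max(color.values())

    def best_bound_set(F, n_features, b=2):
        """Among all bound sets of size b, pick the one needing fewest values."""
        return min(combinations(range(n_features), b),
                   key=lambda B: n_values_of_c(F, n_features, B))

With F from the previous sketch, best_bound_set(F, 3) returns (1, 2), i.e., the bound set {x2, x3}, whose new feature needs only three values; the competing bound sets need four and five.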

An alternative measure for the selection of partitions can be based on the complexity of functions. Let I(F) denote the number of bits needed to encode a function F [6, 4]. Then the best partition is the one that minimizes the overall complexity of the newly discovered functions H and G, i.e., the partition with minimal I(H) + I(G). The complexity of G and H should be lower than that of F, so decomposition takes place only if I(H) + I(G) < I(F) for the best partition.

The number of possible partitions increases exponentially with the number of input features. To limit the complexity of the search, the decomposition should examine only a subset of partitions. In HINT, the partitions examined are only those with fewer than b features in the bound set, where b is a user-defined parameter usually set to 2 or 3.

The partition matrix can be very large. However, it is never explicitly represented in the program, as HINT derives the incompatibility graph directly from the data [6]. The partition matrix is thus a formal construct used to explain and analyze the decomposition algorithm. The coloring of the incompatibility graph is another potential bottleneck; HINT uses an efficient heuristic algorithm for graph coloring [6].

3 Redundancy discovery and removal

A special characteristic of function decomposition is its capability to identify and remove redundant features and/or their values. Namely, whenever a feature c = H(x_i) is discovered such that c uses only a single value (i.e., H is a constant function), the feature x_i is redundant and can be removed from the dataset. Similarly, whenever a feature c = H(x_i) uses fewer values than x_i, the original feature x_i can be replaced by c. Both of these dataset transformations are particularly useful for data preprocessing, since they can reduce the size of the problem and increase the coverage of the problem space by the learning examples. HINT removes redundant features in increasing order of relevance: the least relevant features are checked first, and whenever either type of redundancy is discovered, the dataset is transformed accordingly. Feature relevance is estimated by the ReliefF measure [7].
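Expressed over the helpers above, the two redundancy tests amount to one loop (again an illustrative sketch; HINT additionally orders the features by ReliefF relevance and transforms the dataset as it goes):

    def redundancy_pass(F, n_features):
        """Flag features whose single-feature decomposition c = H(x_i) is
        constant (x_i is redundant) or uses fewer values than x_i itself."""
        for i in range(n_features):   # HINT checks the least relevant first
            k = n_values_of_c(F, n_features, (i,))
            domain = {x[i] for x in F}
            if k == 1:
                print(f'x{i+1}: H is constant, feature can be removed')
            elif k < len(domain):
                print(f'x{i+1}: {len(domain)} values, replaceable by c with {k}')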

4 Discovering feature hierarchies

With all the above ingredients, decomposition enables the discovery of feature hierarchies. Given a dataset, it is first preprocessed to remove redundant features and their values. Next, single-step decomposition is used to decompose the preprocessed tabulated function F into G and H. This process is repeated recursively on G and H until they cannot be decomposed further, i.e., until their further decomposition would increase the overall complexity of the resulting functions.
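A compact recursive sketch of this loop is given below; it reuses the greedy coloring idea from the earlier sketches and substitutes a crude table-size measure for the encoding length I(.) used by HINT, so it is illustrative rather than faithful:

    from itertools import combinations
    from math import log2, prod

    def size_bits(T):
        """Rough stand-in for the encoding length I(T): full table cells
        times bits per output value (the paper's measure is more refined)."""
        n = len(next(iter(T)))
        cells = prod(len({x[i] for x in T}) for i in range(n))
        return cells * log2(max(2, len(set(T.values()))))

    def single_step(F, n, bound):
        """Split F into tabulated G and H for the given bound-set indices,
        coloring partition-matrix columns greedily as in the first sketch."""
        free = [i for i in range(n) if i not in bound]
        cols = {}
        for x, y in F.items():
            cols.setdefault(tuple(x[i] for i in bound),
                            {})[tuple(x[i] for i in free)] = y
        def clash(c1, c2):
            return any(c1[a] != c2[a] for a in c1.keys() & c2.keys())
        H = {}
        for b in cols:
            used = {H[o] for o in H if clash(cols[o], cols[b])}
            H[b] = min(k for k in range(1, len(cols) + 2) if k not in used)
        G = {a + (H[b],): y for b, col in cols.items() for a, y in col.items()}
        return G, H

    def decompose(F, n, b=2):
        """Recurse while some decomposition lowers the joint complexity."""
        best = None
        for B in combinations(range(n), min(b, n)):
            G, H = single_step(F, n, B)
            cost = size_bits(G) + size_bits(H)
            if cost < size_bits(F) and (best is None or cost < best[0]):
                best = (cost, B, G, H)
        if best is None:
            return F                            # leaf: keep F as it is
        _, B, G, H = best
        return {'bound': B,
                'G': decompose(G, n - len(B) + 1, b),
                'H': decompose(H, len(B), b)}

On Table 1, decompose(F, 3) selects the bound set {x2, x3} and returns the two-level hierarchy of Figure 2, with neither G nor H decomposing further.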

Let us illustrate this process with several examples. The first example is the well-known machine learning problem MONK1 [8]. The dataset uses six 2- to 4-valued input features (x1 to x6) and contains 124 (of 432 possible) distinct learning examples that partially define the target concept x1 = x2 OR x5 = 1. HINT develops a feature hierarchy (Figure 3) that

1. correctly excludes the irrelevant input features x3, x4, and x6,
2. transforms x5 to x5' by mapping the four values of x5 to only two values of x5',
3. includes a new feature c and its tabular representation for x1 = x2, and
4. relates c and x5' with a tabular representation of the OR function.

Figure 3: The MONK1 feature hierarchy as discovered by HINT

In other words, the resulting hierarchy correctly represents the target concept.

Like MONK1, the MONK2 learning problem [8] uses the same set of input features, but defines the concept: x_i = 1 for exactly two choices of i ∈ {1, 2, ..., 6}. The dataset contains 172 (of 432 possible) distinct learning examples. Although the discovered structure (Figure 4) does not directly correspond to the original concept definition, it correctly reformulates the target concept by introducing features that count the number of ones in their arguments. Also note that all input features with more than two values are replaced by new binary features.

Figure 4: The feature hierarchy discovered for MONK2. Each node gives the name of the feature and the cardinality of its set of values.

For a real-world example, consider the HOUSING dataset from a management decision support system for allocating housing loans. This system was developed for the Housing Fund of Slovenia and has been used since 1991 in 13 floats of loans with a total value of approximately 90 million ECU. Its task is to rank applicants into one of nine priority classes based on 12 input features; the system uses a hierarchical decision model. For this experiment, we wanted to assess the ability of decomposition to reconstruct the system's model given a random selection of classified cases.

Figure 5: The feature structure discovered for the housing loan allocation problem. Each node gives the name of the feature and the cardinality of its set of values.

Figure 6: Average classification accuracy (CA, in %) of HINT and C4.5 over 10 experiments, with learning sets consisting of a random sample of p percent of the housing problem space.

First, it was observed that in most cases when HINT was given 8000 or more learning examples (4% coverage of the problem space), it discovered the same feature hierarchy (Figure 5). When presented to a domain expert, the discovered features were found meaningful. For instance, the feature c4 represents the present housing status of the applicant, while the features c2, c6, and c8 respectively specify the way of solving the housing problem, the applicant's current status, and his/her social and health conditions. Thus, the structure of the original model was successfully reconstructed. This reconstruction takes about 25 minutes of processor time on an HP J210 Unix workstation.

The quality of the functions discovered for the housing problem was then assessed by classifying the remaining examples, which completely cover this domain but were not included in the learning set. The generalization of HINT was assessed as a learning curve (Figure 6) and compared to that of the state-of-the-art learning tool C4.5 [9], which induces decision trees. For this domain, HINT generalizes well, as it achieves 100% classification accuracy using learning sets that cover 3% or more of the problem space.

The housing problem domain can be viewed as a typical representative of the practical decision-making domains on which HINT was tested. Similar experimental conclusions were obtained, for instance, by re-discovering hierarchical decision models for nursing school applications, job application evaluation, and performance evaluation of public enterprises [6].

5 Feature construction

Function decomposition can also be used for feature construction, defined as a process of augmenting the feature space by inferring or creating additional features. We show here how single-step decomposition can be used both for finding appropriate combinations of original features and for using those combinations to construct new features, which are then added to the original dataset. Again, new features that use a small number of values are preferred.

For example, consider again the dataset from Table 1 and suppose we want to augment it with a single new feature. For this purpose, single-step decomposition examines all pairs of original input features and derives a new feature for each pair. The new feature with the fewest values is preferred; this is c = H(x2, x3), defined by the example set from Figure 2. Function H can now be applied to each example from Table 1 to obtain the corresponding value of c; the augmented dataset is shown in Table 2.
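A minimal sketch of this augmentation step, reusing F and the column coloring color from the first sketch; printing the result reproduces Table 2 below.

    # Add c = H(x2, x3) as a new column to every example of Table 1.
    augmented = [(x1, x2, x3, color[(x2, x3)], y)
                 for (x1, x2, x3), y in F.items()]
    for row in augmented:
        print(*row)    # the rows of Table 2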

    x1    x2    x3    c    y
    lo    lo    lo    1    lo
    lo    lo    hi    1    lo
    lo    med   lo    1    lo
    lo    med   hi    2    med
    lo    hi    lo    1    lo
    lo    hi    hi    3    hi
    med   med   lo    1    med
    med   hi    lo    1    med
    med   hi    hi    3    hi
    hi    lo    lo    1    hi
    hi    hi    lo    1    hi

Table 2: The dataset from Table 1 augmented with the new feature c = H(x2, x3)

Obviously, an original dataset can be augmented with more than a single new feature. We propose a schema where, using single-step decomposition, m new features are added to a dataset: a candidate set of combinations of the original features is examined, and the m new features that have the fewest values are selected; ties are resolved arbitrarily (a minimal sketch of this selection schema is given below). Only original features are used to generate new features, and no hierarchy of features is constructed in the process. The feature construction and the corresponding augmentation of the original dataset result in a new dataset, which may then be used by some machine learning algorithm.

We evaluated this schema on the above-mentioned domains, considering only new features that depend on exactly two original input features. For the MONK1 and MONK2 domains, the benefit of adding new features was assessed by 10-fold cross validation. For HOUSING, 10 experiments were performed with a learning set consisting of a random sample of 1% of the problem space, while the test set consisted of the remaining 99%. Each learning set was first augmented with m new features as described above and then given to C4.5 [9], which was used to induce the target concept from the augmented dataset. The induced decision tree was then tested on the test set, which was augmented with the same set of new features, i.e., using their definitions as derived from the learning set. In all cases, a considerable increase in classification accuracy (Figure 7) and a considerable decrease in the size of the induced decision trees (Figure 8) occur when new features are added. In other words, adding new features improves both the classification accuracy and the transparency of the decision trees.
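The selection schema mentioned above is, in sketch form (a hypothetical helper, building on n_values_of_c from the earlier sketch):

    from itertools import combinations

    def m_new_features(F, n_features, m, b=2):
        """Keep the m feature combinations whose derived features need the
        fewest values; ties are broken by enumeration order (arbitrarily)."""
        scored = sorted(combinations(range(n_features), b),
                        key=lambda B: n_values_of_c(F, n_features, B))
        return scored[:m]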

Figure 7: Classification accuracy of C4.5 as a function of the number of new features m added to (a) MONK1 and MONK2 and (b) the housing loans dataset.

Figure 8: Number of nodes N of C4.5-induced decision trees as a function of the number of new features m added to (a) MONK1 and MONK2 and (b) the housing loans dataset.

The results indicate that function decomposition discovers features that are relevant for these domains. It is further interesting to note how well a simple feature selection criterion, based on the number of values of a new feature, performs in terms of yielding augmented datasets from which C4.5 derives classifiers with high classification accuracy.

6 Further work

In the framework of feature transformation, function decomposition is a promising approach for discovering and constructing new features that are either added to the original dataset or used to transform the dataset into a hierarchy of less complex datasets. The function decomposition algorithm in HINT performs a kind of generalization, so it can also be viewed as a machine learning algorithm.

The approach described in this article is limited to consistent datasets and nominal features. It is therefore desirable to extend the approach to discover new features from noisy data and from data with continuous features. To handle noisy data, a minimal-error decomposition was recently proposed [6]. It is based on representing learning examples with class distributions, and it successively merges the columns of the partition matrix so that the expected classification error is minimized. So far, this method has been evaluated only as a machine learning tool; further work is needed to assess its appropriateness for feature transformation problems.

References

[1] R. L. Ashenhurst, The decomposition of switching functions, tech. rep., Bell Laboratories BL-1(11), 1952.

[2] H. A. Curtis, A New Approach to the Design of Switching Functions. Van Nostrand, Princeton, NJ, 1962.

[3] W. Wan and M. A. Perkowski, A new approach to the decomposition of incompletely specified functions based on graph-coloring and local transformations and its application to FPGA mapping, in Proc. of the IEEE EURO-DAC '92, (Hamburg), 1992.

[4] T. D. Ross, M. J. Noviskey, D. A. Gadd, and J. A. Goldman, Pattern theoretic feature extraction and constructive induction, in Proc. ML-COLT'94 Workshop on Constructive Induction and Change of Representation, (New Brunswick, New Jersey), July 1994.

[5] B. Zupan, M. Bohanec, I. Bratko, and J. Demšar, Machine learning by function decomposition, in Proc. Fourteenth International Conference on Machine Learning (D. H. Fisher, ed.), (San Mateo, CA), Morgan Kaufmann, 1997.

[6] B. Zupan, Machine Learning Based on Function Decomposition. PhD thesis, University of Ljubljana, Apr. 1997.

[7] I. Kononenko, Estimating attributes, in Proc. of the European Conference on Machine Learning (ECML-94) (F. Bergadano and L. De Raedt, eds.), vol. 784 of Lecture Notes in Artificial Intelligence, (Catania), Springer-Verlag, 1994.

[8] S. B. Thrun et al., A performance comparison of different learning algorithms, tech. rep., Carnegie Mellon University CMU-CS, 1991.

[9] J. R. Quinlan, C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers, 1993.

Blaž Zupan is a research assistant at the Jožef Stefan Institute, Ljubljana, Slovenia. His research interests include machine learning, intelligent data analysis and knowledge discovery in databases, computer-aided support for medical decision making, and medical informatics. He holds an MS in computer science from the University of Houston, Texas, and a PhD from the University of Ljubljana, Slovenia. He can be reached at the Department of Intelligent Systems, J. Stefan Institute, Jamova 39, SI-1000 Ljubljana, Slovenia; blaz.zupan@ijs.si.

Marko Bohanec is a research associate at the Jožef Stefan Institute, Ljubljana, Slovenia, and assistant professor in information systems at the Faculty of Organizational Sciences, University of Maribor. His research and development interests are in decision support systems and machine learning. He obtained his PhD in computer science at the Faculty of Electrical Engineering and Computer Science, University of Ljubljana. He has published in journals such as Machine Learning, Acta Psychologica, and Information & Management. Contact him at the Department of Intelligent Systems, J. Stefan Institute, Jamova 39, SI-1000 Ljubljana, Slovenia; marko.bohanec@ijs.si.

Ivan Bratko is a professor of computer science at the Faculty of Computer and Information Science, University of Ljubljana, Slovenia. He heads the AI laboratories at the Jožef Stefan Institute and the University. He is the chairman of ISSEK, the International School for the Synthesis of Expert Knowledge. He has conducted research in machine learning, knowledge-based systems, qualitative modeling, intelligent robotics, heuristic programming, and computer chess. His main interests in machine learning have been learning from noisy data, combining learning and qualitative reasoning, and various applications of machine learning and inductive logic programming, including medicine, ecological modeling, and control of dynamic systems. He is the author of the widely adopted text Prolog Programming for Artificial Intelligence (Addison-Wesley, 1986; second edition 1990). He co-edited (with N. Lavrač) Progress in Machine Learning (Sigma Press, 1987) and co-authored (with I. Mozetič and N. Lavrač) KARDIO: A Study in Deep and Qualitative Knowledge for Expert Systems (MIT Press, 1989). Contact him at the Faculty of Computer and Information Science, University of Ljubljana, Tržaška 25, SI-1000 Ljubljana, Slovenia; ivan.bratko@fri.uni-lj.si.

Janez Demšar is a research assistant at the Faculty of Computer and Information Science, University of Ljubljana, Slovenia. He has a BSc in computer science from the same faculty. His research interests are machine learning and intelligent data analysis. He can be reached at the Faculty of Computer and Information Science, University of Ljubljana, Tržaška 25, SI-1000 Ljubljana, Slovenia; janez.demsar@fri.uni-lj.si.
