Graph Matching and Learning in Pattern Recognition in the last ten years


International Journal of Pattern Recognition and Artificial Intelligence
© World Scientific Publishing Company

Graph Matching and Learning in Pattern Recognition in the last ten years

PASQUALE FOGGIA, GENNARO PERCANNELLA and MARIO VENTO
Department of Information Engineering, Electrical Engineering and Applied Mathematics, University of Salerno, Via Giovanni Paolo II, 132, Fisciano (SA), Italy
{pfoggia,pergen,mvento}@unisa.it

In this paper we examine the main advances registered in the last ten years in Pattern Recognition methodologies based on graph matching and related techniques, analyzing more than 180 papers; the aim is to provide a systematic framework presenting the recent history and the current developments. This is done by introducing a categorization of graph-based techniques and reporting, for each class, the main contributions and the most outstanding research results.

Keywords: Structural Pattern Recognition; Graph Matching; Graph Kernels; Graph Embeddings; Graph Learning; Graph Clustering; Graph and tree search strategies.

1. Introduction

Structural Pattern Recognition bases its theoretical foundations on the decomposition of objects in terms of their constituent parts (subpatterns) and of the relations among them. Graphs, usually enriched with node and edge attributes, are the elective data structures for supporting this kind of representation. Some of the methods working on graphs introduce restrictions on the structure of the graphs (e.g. only allowing planar graphs) or on the kind of attributes (e.g. some methods only allow single real-valued attributes for the graph edges).

The use of a graph-based pattern representation induces the need to formulate the main activities required for Pattern Recognition in terms of operations on graphs: classification, usually intended as the comparison between an object and a set of prototypes, and learning, which is the process of obtaining a model of a class starting from a set of known samples, are among the key issues that must be addressed using graph-based techniques.

The use of graphs in Pattern Recognition dates back to the early seventies, and the paper "Thirty years of graph matching in Pattern Recognition" 26 reports a survey of the literature on graph-based techniques from the first years up to the early 2000s. Since then we have certainly witnessed a maturation of the classical techniques for graph comparison, either exact or inexact; at the same time we are witnessing a rapid growth of many alternative approaches, such as graph embedding and graph kernels, aimed at making it possible to apply to graphs the vector-based techniques for classification and learning (such as the ones derived from statistical classification and learning theory).

[Fig. 1. A graphical representation of the adopted categorization of the considered graph-based techniques: graph matching (exact and inexact matching), graph embedding (isometric, spectral and prototype-based embeddings), graph kernels, graph clustering (clustering of graphs and graph-based clustering), graph learning (learning of graphs and graph-based learning), and miscellaneous problems. The techniques in the figure have been chosen because they either involve some kind of graph comparison, or use a graph-based approach to group objects into classes.]

In this paper we discuss the main advances registered in graph-based methodologies in the last ten years, analyzing more than 180 papers on this topic; the aim is to provide a systematic framework presenting the recent history of graphs in Pattern Recognition and the current trends. Our analysis starts from the above mentioned survey 26 and completes its contents by considering a selection of the most recent main contributions; consequently, the present paper, for the sake of conciseness, reports only references to works published during the last ten years. The reader is kindly invited to consult Ref. 26 for the earlier related works. However, the taxonomy of the papers presented in Ref. 26 has been extended with other graph-based problems that are related to matching, either because they involve some form of graph comparison, or because they use a graph-based approach to group patterns into classes. Figure 1 shows a graphical representation of the taxonomy adopted in this paper.

In fact, in the last decade we have witnessed the birth and growth of methods facing learning and classification with a rather innovative scientific vision: the computational burden of matching algorithms, together with their intrinsic complexity, in opposition to the well established world of statistical Pattern Recognition methodologies, suggested new paradigms for graph-based methods. Why don't we try to reduce graph matching and learning to vector-based operations, so as to make the use of statistical approaches possible? These are two opposite ways of facing the problem, each with its pros and cons: on one side, graphs from the beginning to the end, with a few heavy algorithms, but with the exploitation of all the information contained in the graphs; on the other side, the risk of losing discriminating power during the conversion of graphs into vectors (by selecting suitable properties), counterbalanced by the immediate access to all the theoretically assessed achievements of the statistical framework.

In a sense, there are some traditional tools that can be considered to be halfway between these two approaches: an example is Graph Edit Distance, which is based on a matching between the nodes and the edges of the two graphs, but produces distance information that can be used to cast the graphs into a metric space. However, Graph Edit Distance can still be considered an approach of the first kind, since in the computation of the metric the information attached to the subparts can be considered in a context-dependent way, and does not have to be reduced a priori to a vectorial form.

These two opposite factions are now simultaneously active, each hoping to overcome the other; ten years ago these innovative methods were in the background, but now they are gaining more and more attention in the scientific literature on graphs. This is the reason why the categorization reported in this paper has been further expanded by including a new section describing a variety of novel approaches, such as graph embedding, graph kernels, graph clustering and graph learning, dedicating a subsection to each of them. It is worth pointing out that these methods were of course already known at the time of Ref. 26, but their diffusion and scientific interest have shown a significant growth in the last decade.

For instance, a recent survey by Hancock and Wilson 71 compares and contrasts the work on graph-based techniques by the Bern group led by Horst Bunke and the York group led by Edwin Hancock. The first group has historically put more emphasis on the purely structural aspects of graph-based techniques, while the second has focused on the extensions to the graph domain of probabilistic and information theoretic methodologies; however, both schools in the last decade have found a point of convergence in the adoption of graph kernels and graph embedding techniques. Another recent paper by Livi and Rizzi 98 presents a survey of graph matching techniques. However, despite its title, it is mostly dedicated to graph embeddings and graph kernels, and does not aim to cover the graph matching techniques comprehensively; furthermore, the paper is less specifically devoted to approaches used within the Pattern Recognition community.

The overall organization of our paper is based on a categorization of the approaches with respect to the problem formulation they adopt, and secondarily to the kind of technique used to face the problem, following the taxonomy reported in Figure 1. We have distinguished between graph matching problems, which will be presented in Section 2, and other problems related to graph comparison, which are discussed in Section 3. In particular, the section on graph matching is divided between exact and inexact matching techniques. The section on other problems is articulated according to the techniques that have obtained most attention in recent literature, namely graph embedding, graph kernels, graph clustering and graph learning, with a miscellaneous problems subsection for less common but related problems.

For reasons of space, in this survey we have focused on the algorithms and not on their applications. The interested reader may find some complementary surveys on the applications of graph matching to Computer Vision and Pattern Recognition in Ref. 28, 53. For the very same reasons, we have not included research papers from outside of the Pattern Recognition community. Graph-based methods are used and investigated in many other research fields; among them, we can mention, with no pretense at completeness, Data Mining, Machine Learning, Complex Networks Analysis and Bioinformatics.

2. Graph Matching

We briefly recall the terminology used in our previous survey 26. Exact graph matching is the search for a mapping between the nodes of two graphs which is edge-preserving, in the sense that if two nodes in the first graph are linked by an edge, the corresponding nodes in the second graph must be linked by an edge, too. Several variants of exact matching exist (e.g. isomorphism, subgraph isomorphism, monomorphism, homomorphism, maximum common subgraph) depending on whether this constraint must hold in both directions of the mapping or not, and on whether the mapping must be injective or surjective.

More formally, given two graphs G_1 = (V_1, E_1) and G_2 = (V_2, E_2) (where V and E are the sets of nodes and edges respectively), a mapping is a function \mu : V_1 \to V_2. A mapping \mu is edge preserving iff:

\forall v, w \in V_1: (v, w) \in E_1 \Rightarrow (\mu(v), \mu(w)) \in E_2 \lor \mu(v) = \mu(w)   (1)

An edge preserving mapping is also called a homomorphism. A monomorphism, also called an edge-induced subgraph isomorphism, is a homomorphism \mu that is also injective:

\forall v \neq w \in V_1: \mu(v) \neq \mu(w)   (2)

A graph isomorphism is a monomorphism \mu that is bijective, and whose inverse mapping \mu^{-1} is also a monomorphism:

\forall v_2 \in V_2, \; \exists v_1 = \mu^{-1}(v_2) \in V_1 : v_2 = \mu(v_1), \quad \text{and } \mu^{-1} \text{ is a monomorphism}   (3)

A mapping \mu is a subgraph isomorphism, which some authors call a node-induced subgraph isomorphism, if there is a (node-induced) subgraph G'_2 of G_2 such that \mu is an isomorphism between G_1 and G'_2. More formally:

V'_2 \subseteq V_2, \quad V'_2 = \{v_2 \in V_2 : \exists v_1 \in V_1 : v_2 = \mu(v_1)\}
E'_2 \subseteq E_2, \quad E'_2 = E_2 \cap (V'_2 \times V'_2)
\mu \text{ is an isomorphism between } G_1 \text{ and } G'_2 = (V'_2, E'_2)   (4)

Finally, the maximum common subgraph problem is the search for the largest subgraph of G_1 that is isomorphic to a subgraph of G_2 (and usually, of the corresponding mapping between the two subgraphs).
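As a small, concrete illustration of these definitions (a toy sketch written for this exposition, not an algorithm from any of the cited papers), the following Python fragment checks whether a candidate mapping between two hypothetical directed graphs satisfies Eqs. (1), (2) and (4); the graphs and the mapping are invented examples.

def is_edge_preserving(mu, E1, E2):
    # Eq. (1): every edge of G1 must map to an edge of G2,
    # unless both endpoints collapse onto the same node of G2
    return all((mu[v], mu[w]) in E2 or mu[v] == mu[w] for (v, w) in E1)

def is_monomorphism(mu, E1, E2):
    # Eq. (2): edge preserving and injective
    injective = len(set(mu.values())) == len(mu)
    return injective and is_edge_preserving(mu, E1, E2)

def is_subgraph_isomorphism(mu, E1, E2):
    # Eq. (4): mu must be an isomorphism between G1 and the node-induced
    # subgraph of G2 spanned by the image of mu
    image = set(mu.values())
    E2_induced = {(v, w) for (v, w) in E2 if v in image and w in image}
    if not is_monomorphism(mu, E1, E2_induced):
        return False
    inv = {w: v for v, w in mu.items()}
    return is_edge_preserving(inv, E2_induced, E1)

# G1 is a directed path a -> b -> c; G2 contains it plus an extra chord (1, 3)
E1 = {("a", "b"), ("b", "c")}
E2 = {(1, 2), (2, 3), (1, 3)}
mu = {"a": 1, "b": 2, "c": 3}
print(is_monomorphism(mu, E1, E2))          # True: an edge-induced embedding exists
print(is_subgraph_isomorphism(mu, E1, E2))  # False: the chord (1, 3) has no preimage

The example makes the distinction between the two notions tangible: the same mapping is a monomorphism but not a node-induced subgraph isomorphism.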

In inexact graph matching, instead, the constraints on edge preservation are relaxed, either because the algorithms attempt to deal with errors in the input graphs (and so we have error-correcting matching) or because, in order to reduce the computational cost, they search for the mapping with a strategy that does not ensure the optimality of the found solution (approximate or suboptimal matching). For inexact matching there is not a single formal statement of the problem; instead, different papers often use slightly different formalizations, which may lead to different ways of relaxing the edge preservation constraints. With no pretense at completeness, in the following we describe two formalizations that have been used by several works.

In the first definition, the concept of a mapping function \mu is extended so as to include the possibility of mapping a node v to a special, null node denoted as \epsilon; thus the mapping is a function \mu : V_1 \to V_2 \cup \{\epsilon\}. We will assume that \mu is injective for the nodes of V_1 not mapped to \epsilon,

\forall v \neq w \in V_1: \mu(v) \neq \epsilon \Rightarrow \mu(v) \neq \mu(w)   (5)

while allowing that several nodes may be mapped to \epsilon. With a slightly improper notation, we will write \mu^{-1}(w) = \epsilon to indicate that there is no node v \in V_1 such that \mu(v) = w. Then, the cost of a mapping \mu is defined as:

C(\mu) = \sum_{v \in V_1,\, \mu(v) \neq \epsilon} C_R(v, \mu(v)) + \sum_{v \in V_1,\, \mu(v) = \epsilon} C_D(v) + \sum_{w \in V_2,\, \mu^{-1}(w) = \epsilon} C_D(w)
       + \sum_{(v,w) \in E_1,\, (\mu(v), \mu(w)) \in E_2} C'_R((v,w), (\mu(v), \mu(w)))
       + \sum_{(v,w) \in E_1,\, (\mu(v), \mu(w)) \notin E_2} C'_D((v,w)) + \sum_{(v,w) \in E_2,\, (\mu^{-1}(v), \mu^{-1}(w)) \notin E_1} C'_D((v,w))   (6)

where C_R(., .) is the cost for the replacement of a node, C_D(.) is the cost for the deletion of a node, and C'_R and C'_D are the replacement and deletion costs for edges. These cost functions have to be defined according to the application requirements, and usually take into account additional, application-dependent attributes attached to nodes and edges. In this formulation, the matching problem is cast as the search for the matching that minimizes the cost C(\mu). With an appropriate choice of the cost functions, it can be demonstrated that the exact matching problems defined previously can be seen as special cases of this one, with the additional requirement that the matching cost must be 0.
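The following minimal Python sketch evaluates the cost of Eq. (6) for a given mapping; the unit cost functions are hypothetical placeholders (in a real application they would depend on node and edge attributes), and the tiny example graphs are invented.

EPS = None  # the null node, denoted epsilon in the text

def mapping_cost(mu, V1, E1, V2, E2,
                 c_node_sub=lambda v, w: 0.0, c_node_del=lambda v: 1.0,
                 c_edge_sub=lambda e, f: 0.0, c_edge_del=lambda e: 1.0):
    # mu maps every node of V1 either to a node of V2 or to EPS
    mapped = {v: w for v, w in mu.items() if w is not EPS}
    inv = {w: v for v, w in mapped.items()}
    cost = 0.0
    # node terms: substitutions, deletions in G1, insertions (unmatched nodes of G2)
    cost += sum(c_node_sub(v, w) for v, w in mapped.items())
    cost += sum(c_node_del(v) for v in V1 if mu.get(v, EPS) is EPS)
    cost += sum(c_node_del(w) for w in V2 if w not in inv)
    # edge terms: substituted edges, edges of G1 without a counterpart, and vice versa
    for (v, w) in E1:
        f = (mapped.get(v), mapped.get(w))
        cost += c_edge_sub((v, w), f) if f in E2 else c_edge_del((v, w))
    for (p, q) in E2:
        if (inv.get(p), inv.get(q)) not in E1:
            cost += c_edge_del((p, q))
    return cost

V1, E1 = {"a", "b"}, {("a", "b")}
V2, E2 = {1, 2, 3}, {(1, 2), (2, 3)}
mu = {"a": 1, "b": 2}                      # node 3 of G2 and edge (2, 3) are unmatched
print(mapping_cost(mu, V1, E1, V2, E2))    # 2.0: one node insertion plus one edge insertion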

In the second formulation, called weighted graph matching, the graphs are represented through their adjacency matrices; usually the elements of the matrices are not restricted to 0 and 1, but can express a continuous weight for the relation between two nodes: the generic element A_{ij} of the matrix A is 0 if there is no edge between nodes i and j, and otherwise has a real value in ]0, 1] denoting the weight of the edge (i, j). Given two graphs represented by their adjacency matrices A and B, a compatibility tensor C_{ijkl} is introduced to measure the compatibility between two edges:

C_{ijkl} = \begin{cases} 0 & \text{if } A_{ij} = 0 \text{ or } B_{kl} = 0 \\ c(A_{ij}, B_{kl}) & \text{otherwise} \end{cases}   (7)

where c(., .) is a suitably defined compatibility function. The matching is represented by a matching matrix M, whose element M_{ik} is 1 if node i of the first graph is matched with node k of the second graph, and 0 otherwise. Thus the matching problem is formulated as the search for the matrix M that maximizes the following function:

W(M) = \sum_i \sum_j \sum_k \sum_l M_{ik} M_{jl} C_{ijkl}   (8)

subject to the constraints:

M_{ik} \in \{0, 1\}; \quad \forall i, \; \sum_k M_{ik} \leq 1; \quad \forall k, \; \sum_i M_{ik} \leq 1   (9)

Also with this formulation, it can be demonstrated that, with a suitable choice of the compatibility function c(., .), the various forms of exact matching can be seen as special cases.
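A minimal sketch of this formulation follows (an invented example, not a method from the survey): it builds the compatibility tensor of Eq. (7) with the hypothetical choice c(a, b) = 1 - |a - b| and evaluates the objective of Eq. (8) for a candidate matching matrix.

import numpy as np

def compatibility(A, B):
    n, m = A.shape[0], B.shape[0]
    C = np.zeros((n, n, m, m))   # C[i, j, k, l]: edge (i, j) of G1 vs edge (k, l) of G2
    for i in range(n):
        for j in range(n):
            for k in range(m):
                for l in range(m):
                    if A[i, j] > 0 and B[k, l] > 0:   # Eq. (7): zero if either edge is absent
                        C[i, j, k, l] = 1.0 - abs(A[i, j] - B[k, l])
    return C

def score(M, C):
    # W(M) = sum_{i,j,k,l} M[i,k] M[j,l] C_{ijkl}   (Eq. 8)
    return np.einsum('ik,jl,ijkl->', M, M, C)

A = np.array([[0, .8, 0], [.8, 0, .5], [0, .5, 0]])   # weighted adjacency of G1
B = np.array([[0, .7, 0], [.7, 0, .6], [0, .6, 0]])   # weighted adjacency of G2
M = np.eye(3)                                         # identity matching
print(score(M, compatibility(A, B)))                  # close to 3.6 for this toy pair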

While in the years covered by Ref. 26 the research explored both exact and inexact matching, the recent work on graphs in the Pattern Recognition community has been mostly focused on inexact graph matching. This may be due to the fact that today Pattern Recognition research is applying graphs to more complex problems than those that were feasible some years ago, and hence larger and noisier graphs are used more frequently.

2.1. Exact Matching

While there has been relatively little work on improving existing exact matching algorithms, some effort has been put into providing a better characterization of the existing methods. As an example, the 2003 paper by De Santo et al. 138 presents an extensive comparative evaluation of four exact algorithms for graph isomorphism and graph-subgraph isomorphism.

Most existing exact matching algorithms are based on some form of tree search, where the matching is constructed starting from an empty mapping function and adding a pair of nodes at a time, usually with the possibility of backtracking, and with the use of heuristics to avoid the complete exploration of the space of all the possible matchings. In 2007, Konc and Janežič 87 propose MaxCliqueDyn, an improved algorithm for finding the Maximum Clique (and hence the Maximum Common Subgraph) which uses branch and bound, combined with approximate graph coloring for finding tight bounds in order to prune the search space. In a 2011 paper, J. Ullmann 160 presents a substantial improvement of his own very well known subgraph isomorphism algorithm from 1976. The new algorithm incorporates several ideas from the literature on the Binary Constraint Satisfaction Problem, of which subgraph isomorphism can be considered a special case. Also Zampelli et al. 179 propose a method based on Constraint Satisfaction, which is an extension of the technique introduced by Larrosa and Valiente. A further development of the technique, with the introduction of a better filtering based on the AllDifferent constraint, is proposed by C. Solnon 152.

Among the approaches not based on tree search, we can mention Gori et al. 64, who, in their 2005 paper, propose an isomorphism algorithm based on Random Walks, which works only on a class of graphs denoted by the authors as Markovian Spectrally Distinguishable graphs; the authors verify experimentally on a large database of graphs that, as long as the graphs have some kind of irregularity or randomness, the probability of not satisfying this assumption is very low. The 2011 paper by Weber et al. 169 extends the matching algorithm based on the construction of a decision tree by Messmer and Bunke 109, significantly reducing the spatial complexity for graphs whose nodes have a small number of different labels. In their 2004 paper 39, Dickinson et al. discuss the matching problem (graph isomorphism, graph-subgraph isomorphism and maximum common subgraph) for the special case of graphs having unique node labels. Finally, the 2012 paper by Dahm et al. 34 presents a technique for speeding up existing exact subgraph isomorphism algorithms on large graphs.

2.2. Inexact Matching

Inexact matching methods have received comparatively more attention in the research community, both through extensions of existing techniques and through the introduction of novel ideas unrelated to previous work. In particular, the extensions of previous methods have mostly concerned algorithms based on the reduction of graph matching to a continuous optimization problem, algorithms based on spectral properties of the graphs (i.e. properties related to the eigenvalues and eigenvectors of the adjacency matrix or of other matrices characterizing the graph structure), and methods approximating the solution of the graph matching problem by means of bipartite graph matching, which is a simpler problem solvable in polynomial time.

Many inexact matching algorithms are formulated as an approximate way to compute the Graph Edit Distance. A recent paper by Gao et al. 57 in 2010 presents a survey on this topic. Graph Edit Distance computes the distance between two graphs on the basis of the minimal set of edit operations (e.g. node additions and deletions) needed to transform one graph into the other.
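To make the definition of Graph Edit Distance concrete, the following brute-force Python sketch (unit costs, invented toy graphs, exponential complexity, not a method from the survey) enumerates every injective partial mapping between the two node sets and takes the minimum of a cost of the form of Eq. (6).

from itertools import combinations, permutations

def ged_brute_force(V1, E1, V2, E2):
    best = float('inf')
    for k in range(min(len(V1), len(V2)) + 1):
        for kept in combinations(sorted(V1, key=str), k):    # nodes of G1 that survive
            for image in permutations(sorted(V2, key=str), k):  # their images in G2
                mu = dict(zip(kept, image))
                cost = (len(V1) - k) + (len(V2) - k)          # node deletions + insertions
                # edges with no counterpart on either side (unit cost each)
                cost += sum(1 for (v, w) in E1
                            if (mu.get(v), mu.get(w)) not in E2)
                inv = {b: a for a, b in mu.items()}
                cost += sum(1 for (p, q) in E2
                            if (inv.get(p), inv.get(q)) not in E1)
                best = min(best, cost)
    return best

# an undirected path a-b-c versus a triangle 1-2-3, stored as directed edge pairs
E1 = {("a", "b"), ("b", "a"), ("b", "c"), ("c", "b")}
E2 = {(1, 2), (2, 1), (2, 3), (3, 2), (1, 3), (3, 1)}
print(ged_brute_force({"a", "b", "c"}, E1, {1, 2, 3}, E2))  # 2: the missing chord, once per direction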

A 2012 paper by Solé-Ribalta et al. 151 provides a theoretical discussion on the relation between the properties of the distance function and the costs assigned to each edit operation. Although in principle the Graph Edit Distance problem is not related to matching, in practice most methods compute the distance by finding a matching for the nodes that are preserved by the edit operations (i.e. those that are not added or removed, but possibly have their label changed); given this matching, the edit distance can be obtained as the sum of a term accounting for the matched nodes and their edges, and a term accounting for the remaining nodes/edges (see Eq. 6). So usually the outcome of the algorithm is not only an indication of the distance between the graphs, but also the matching that is supposed to minimize the value of this distance. This is why we have chosen to include some Graph Edit Distance methods in this section.

2.2.1. Techniques based on tree search

Methods based on tree search have also been used for inexact matching. In this case, the adopted heuristics may not ensure that the optimal solution is found, yielding a suboptimal matching. As an example, Sanfeliu et al. and Serratosa et al. extend their previous work on inexact matching of Function-Described Graphs (FDG), which are Attributed Relational Graphs enriched with constraints on the joint probabilities of nodes and edges, used to represent a set of graphs; in Ref. 141 Serratosa et al. detail how these FDG can be automatically constructed. Cook et al. 29 in 2003 propose the use of beam search, a heuristic search method derived from the A* algorithm, for computing the Graph Edit Distance. The paper by Hidović and Pelillo 73 in 2004 extends the definition of a graph metric based on Maximum Common Subgraph, introduced by Bunke in 1999, so that it can also be applied to graphs with node attributes.

2.2.2. Continuous optimization

While graph matching is inherently a discrete optimization problem, several inexact algorithms have been proposed that reformulate it as a continuous problem (by relaxing some constraints), solve the continuous problem using one of the many available optimization algorithms, and then recast the found solution in the discrete domain. Usually the algorithm used for the continuous problem only ensures that a local optimum is found; moreover, since a discretization step is required afterwards, the matching found may not even be guaranteed to be locally optimal. An example of the evolution of an existing matching method of this category is given by the 2003 paper by Massaro and Pelillo 107, which improves a previous work on the search for the Maximum Common Subgraph that uses a theorem by Bomze to reformulate this problem as a quadratic optimization in a continuous domain.
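The general relax-optimize-discretize scheme just described can be sketched as follows; this is a deliberately simplified, hypothetical instance (projected gradient ascent on the relaxed objective of Eq. (8), followed by a Hungarian-algorithm discretization), not any specific published algorithm, and it assumes graphs of equal size.

import numpy as np
from scipy.optimize import linear_sum_assignment

def sinkhorn(M, n_iter=20):
    # approximate projection onto doubly-stochastic matrices by
    # alternating row and column normalization
    for _ in range(n_iter):
        M = M / M.sum(axis=1, keepdims=True)
        M = M / M.sum(axis=0, keepdims=True)
    return M

def relaxed_match(C, n_iter=100, step=0.1):
    n1, n2 = C.shape[0], C.shape[2]
    M = np.full((n1, n2), 1.0 / n2)                   # uniform continuous starting point
    for _ in range(n_iter):
        grad = 2.0 * np.einsum('ijkl,jl->ik', C, M)   # gradient of W(M) in Eq. (8)
        M = sinkhorn(np.clip(M + step * grad, 1e-9, None))
    # discretization step: best permutation with respect to the relaxed solution
    row, col = linear_sum_assignment(-M)
    M_bin = np.zeros_like(M)
    M_bin[row, col] = 1.0
    return M_bin

A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], float)   # path graph
B = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], float)   # triangle
C = np.einsum('ij,kl->ijkl', A, B)     # simple compatibility: 1 where both edges exist (cf. Eq. 7)
print(relaxed_match(C))                # a 3x3 permutation matrix

Note that the final assignment step is exactly where the guarantee of local optimality can be lost, as remarked above.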

Zaslavskiy et al. 182 in their 2009 paper present a graph matching algorithm in which the matching is formulated as a convex-concave programming problem, which is solved by interpolating between two approximate simpler formulations. Also the 2011 paper by Rota Bulò et al. 134 is based on the same formulation of graph matching; in this case the authors solve the quadratic optimization problem using infection-immunization dynamics, a new iterative algorithm based on evolutionary game theory.

The 2002 paper by van Wyk et al. 164 addresses the problem of Attributed Graph matching as a parameter identification problem, and proposes the use of a Reproducing Kernel Hilbert Space (RKHS) interpolator to solve this problem. The 2003 paper by van Wyk and van Wyk 161 extends the previous method by providing a more general formulation of the problem. The same authors in a 2004 paper 163 further generalize the method, presenting a kernel-based framework for graph matching which includes the previous two algorithms as special cases. In 2004, van Wyk and van Wyk 162 present a graph matching algorithm based on the Projections Onto Convex Sets approach. The 2006 paper by Justice and Hero 81 proposes a reformulation of the Graph Edit Distance as a Binary Linear Programming problem, for which they provide upper and lower bounds in polynomial time. Kostin et al. 89 in 2005 present an extension of the probabilistic relaxation algorithm by Christmas et al. 25. Chevalier et al. 22 propose in a 2007 paper a technique that integrates probabilistic relaxation with bipartite graph matching, applied to Region Adjacency Graphs. In their 2008 paper 157, Torresani et al. introduce an algorithm based on a technique called dual decomposition: the matching problem (in a continuous reformulation) is decomposed into a set of simpler problems, depending on a parameter vector; the simpler problems can be solved providing a lower bound to the minimization of the functional to be optimized. Then the algorithm searches for the tightest bound by varying the parameter vector. Caetano et al. 19 propose in 2009 a technique in which the functional to be optimized has a parametric form, and a training phase is used to learn these parameters. In a 2011 paper 21, Chang and Kimia present an extension of the Graduated Assignment Graph Matching by Gold and Rangarajan, modified so as to work on hypergraphs instead of graphs. Zhou and De la Torre 184 present a method called factorized graph matching, in which the affinity matrix used to define the functional to be optimized is factored into a Kronecker product of smaller matrices, separately encoding the structure of the graphs and the affinities between nodes and between edges. The authors propose an optimization method based on this factorization that leads to an improvement in space and time requirements. Sanromà et al. 137 in 2012 propose a special purpose, probabilistic graph matching method for graphs representing sets of 2D points, based on the Expectation Maximization (EM) algorithm.

Solé-Ribalta and Serratosa 149 in their 2011 paper propose two sub-optimal algorithms for the common labelling problem, a generalization of inexact graph matching in which the number of graphs is larger than two (the problem cannot be reduced to several pairwise matchings). The first proposed algorithm uses an extension of Graduated Assignment, while the second is based on a probabilistic formulation and adopts an iterative approach somewhat similar to Probabilistic Relaxation.

A 2011 paper by Rodenas et al. 131 presents a parallelized version of the first algorithm. A 2013 paper by Solé-Ribalta and Serratosa 150 presents a further development of the first algorithm, based on the matching of the nodes of all graphs to a virtual node set.

2.2.3. Spectral methods

Spectral matching methods are based on the observation that the eigenvalues of a matrix do not change if the rows and columns are permuted. Thus, given the matrix representations of two isomorphic graphs (for instance, their adjacency matrices), they have the same eigenvalues. The converse is not true; so, spectral methods are inexact in the sense that they do not ensure the optimality of the solution found.

Caelli and Kosinov 18,16 in 2004 propose a matching algorithm that uses the graph eigenvectors to define a vector space onto which the graph nodes are projected; a clustering algorithm in this vector space is used to find possible matches. Also Robles-Kelly and Hancock 130, in a 2007 paper, propose the embedding of graph nodes into a different space (a Riemannian manifold) using spectral properties. The 2004 and 2005 papers by Robles-Kelly and Hancock 129,128 present a graph matching approach based on Spectral Seriation of graphs: the adjacency matrix is transformed into a sequence using spectral properties, then the matching is performed by computing the String Edit Distance between these sequences. Cour et al. 31 in 2007 propose a spectral matching method called balanced graph matching, using a novel relaxation scheme that naturally incorporates matching constraints. The authors also introduce a normalization technique that can be used to improve several other algorithms, such as the classical Graduated Assignment Graph Matching by Gold and Rangarajan. Cho et al. 23 propose a reformulation of inexact graph matching as a random walk problem, and show that this formalization provides a theoretical interpretation of both spectral methods and of some other techniques based on continuous optimization; in this framework, the authors present an original algorithm based on techniques commonly used for Web ranking.

In a 2006 paper Qiu and Hancock 118 present an approximate, hierarchical method for graph matching that uses spectral properties to partition each graph into non-overlapping subgraphs, which are then matched separately, with a significant reduction of the matching time. The same authors present a somewhat similar idea in a 2007 paper 119, where the partition is based on commute times, which can be computed from the Laplacian spectrum of the graph. Wilson and Zhu 171 in their 2008 paper present a survey of different techniques for the spectral representation of graphs and trees. In 2011, Escolano et al. 45 propose a matching method based on the representation of a graph as a bag of partial node coverages, described using spectral features. In 2011, Duchenne et al. 40 present a generalization of spectral matching techniques to hypergraphs, using some results from tensor algebra.
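The permutation invariance that all these methods exploit can be illustrated with a minimal sketch (an invented toy example with unit edge weights, not any of the surveyed algorithms): isomorphic graphs have identical adjacency spectra, so comparing sorted eigenvalues gives a permutation-invariant, but inexact, dissimilarity, since non-isomorphic cospectral graphs would also obtain distance zero.

import numpy as np

def spectrum(A):
    return np.sort(np.linalg.eigvalsh(A))        # real spectrum of a symmetric adjacency matrix

def spectral_distance(A, B):
    sa, sb = spectrum(A), spectrum(B)
    k = min(len(sa), len(sb))                    # naive alignment of the two spectra
    return np.linalg.norm(sa[-k:] - sb[-k:])

A = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]], float)   # star on 3 nodes
P = np.array([[0, 1, 0], [0, 0, 1], [1, 0, 0]], float)   # a permutation matrix
print(round(spectral_distance(A, P @ A @ P.T), 6))       # 0.0: same graph, relabelled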

2.2.4. Other approaches

Among other techniques used for inexact matching, Bagdanov and Worring 2 in a 2003 paper introduce a matching algorithm based on bipartite matching for the so-called First Order Gaussian Graphs (FOGG), which are an extension of random graphs having Gaussian random variables as their node attributes. Also the paper by M. Skomorowski 147 in 2007 presents a pattern recognition algorithm based on a variant of random graphs, using for the matching a syntactic approach based on graph grammars. The 2003 paper by Park et al. 116 addresses the problem of partial matching between a model graph and a larger image graph by combining a probabilistic formulation similar to the one used in probabilistic relaxation with a greedy search technique. In a 2006 paper, Conte et al. 27 present an inexact matching technique for pyramidal graph structures, which is based on weighted bipartite graph matching, but uses information from the upper levels of the pyramid to constrain the matching of the lower levels. Xiao et al. 174 in 2008 propose a graph distance based on a vector representation called Substructure Abundance Vector (SAV), which can be considered as an extension of the graph distance based on Maximum Common Subgraph (MCS). The paper by S. Auwatanamongkol 1 in 2007 proposes a genetic algorithm for a special case of inexact matching, where the nodes are associated with 2D points. Bourbakis et al. 9 in 2007 introduce the so-called Local-Global graphs (L-G graphs), as an extension of Region Adjacency graphs in which the edges are obtained through a Delaunay triangulation, for which they introduce an inexact, suboptimal matching algorithm based on a greedy search. In 2002, Wang et al. 168 present a polynomial algorithm for inexact graph-subgraph matching for the special case of undirected acyclic graphs. The 2004 paper by Sebastian et al. 140 presents a Graph Edit Distance algorithm for the special case of shock graphs, based on dynamic programming. In their 2008 paper 4, Bai and Latecki propose an inexact suboptimal matching algorithm for skeleton graphs, based on the use of bipartite graph matching. Chowdury et al. 24 in a 2009 paper combine weighted bipartite graph matching with the use of the automorphism groups for the cycles contained in the graph, to improve the accuracy of the matching found.

A 2009 paper by Riesen and Bunke 125 proposes an approximation of Graph Edit Distance with the use of Bipartite Graph Matching, solved using the Munkres algorithm. The 2010 paper by Kim et al. 83 approximates the matching between Attributed Relational Graphs using the nested assignment problem: an inner assignment step is used to find the best matching of the adjacent edges; this information is then used to define a matching cost for the nodes, and an outer assignment step finds the node matchings that minimize the sum of these costs. This double application of the assignment problem is the original aspect of this method, differentiating it, for instance, from the heuristic proposed by Riesen and Bunke. Also Raveaux et al. 121 in 2010 present an approximate algorithm based on bipartite graph matching; in this case the aim is to compute an approximation of the Graph Edit Distance, and the bipartite matching is performed between small subgraphs of each of the two graphs.
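The following sketch illustrates only the assignment step common to these bipartite approximations (the cost models of the cited papers are richer; the costs and sizes below are hypothetical): node-to-node substitution costs are augmented with deletion and insertion entries and solved with an optimal assignment algorithm; the induced node mapping could then be scored with a cost like Eq. (6).

import numpy as np
from scipy.optimize import linear_sum_assignment

def bipartite_node_assignment(n1, n2, sub_cost, del_cost=1.0, ins_cost=1.0):
    FORBIDDEN = 1e6                                  # large finite value standing for "not allowed"
    big = np.full((n1 + n2, n2 + n1), FORBIDDEN)
    big[:n1, :n2] = sub_cost                         # substitutions
    big[:n1, n2:] = np.where(np.eye(n1) == 1, del_cost, FORBIDDEN)   # deletions of G1 nodes
    big[n1:, :n2] = np.where(np.eye(n2) == 1, ins_cost, FORBIDDEN)   # insertions of G2 nodes
    big[n1:, n2:] = 0.0                              # dummy-to-dummy pairs cost nothing
    rows, cols = linear_sum_assignment(big)
    # keep only real-node to real-node pairs of the optimal assignment
    return {int(r): int(c) for r, c in zip(rows, cols) if r < n1 and c < n2}

# hypothetical toy example: three nodes against two, with given substitution costs
print(bipartite_node_assignment(3, 2, sub_cost=np.array([[0., 1.], [1., 0.], [1., 1.]])))
# {0: 0, 1: 1}: the third node of the first graph is left to be deleted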

In 2011, Fankhauser et al. 46 present an algorithm for computing the graph edit distance using bipartite graph matching, solved using the algorithm by Volgenant and Jonker. The same authors also propose a suboptimal technique for graph isomorphism, also based on bipartite graph matching. The algorithm has the distinctive feature that it either finds an exact solution, or it rejects the pair of graphs; thus a slower algorithm can be used for the cases not covered by the proposed method. Tang et al. 155 in 2011 propose a graph matching algorithm based on the Dot Product Representation of Graphs (DPRG) proposed by Scheinerman and Tucker 139, which represents each node using a numeric vector chosen so that each edge value corresponds approximately to the dot product of the nodes connected by the edge; the choice of the node vectors is formulated as a continuous optimization problem. The proposed method is extended in a 2012 paper by the same authors 156. The 2011 paper by Macrini et al. 104 proposes a matching algorithm for bone graphs, which are a representation for 3D shapes, using weighted bipartite graph matching. The paper by Jiang et al. 78 in 2011 presents a technique for inexact subgraph isomorphism based on geometric hashing, requiring very little computational cost for the intended use case of searching for several small input graphs within a large reference graph.

A novel optimization technique, Estimation of Distribution Algorithms (EDA), has been successfully used for inexact graph matching. EDA are somewhat similar to genetic/evolutionary algorithms, but the parameters of each tentative solution are considered as random variables; a stochastic sampling process is used to produce the next generation (a simplified sketch of this scheme is reported below). The paper by Bengoetxea et al. 5 in 2002 proposes the use of EDA for inexact, suboptimal graph matching, by associating each node of the first graph with a random variable whose possible values are the nodes of the second graph. In 2005, Cesar et al. 80 formulate the inexact graph homomorphism as a discrete optimization problem, and compare beam search, genetic algorithms and EDA for solving this problem. A different approach, also based on a probabilistic framework, is proposed by Caelli and Caetano; the matching is formulated as an inference problem on a Hidden Markov Random Field (HMRF), for which an approximate solution is computed.

The 2004 paper by Dickinson et al. 38 defines a graph similarity measure for the special case of graphs having unique node labels, and proposes a hierarchical algorithm to efficiently compute this measure. He et al. 72 in 2004 propose an ad hoc matching algorithm for skeleton graphs, which performs a linearization of the graphs, and then uses string matching to find an inexact correspondence. A similar approach is presented in the paper by Das et al. 35 in 2012 for graphs obtained from fingerprints. In 2008, Gao et al. 56 introduce a Graph Distance algorithm for the special case of graphs whose nodes represent points in a 2D space, based on the Earth Mover Distance (EMD).
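The simplified EDA sketch below is a hypothetical, toy rendering of the general scheme described above (not the algorithm of Ref. 5): each node of the first graph is associated with a categorical distribution over the nodes of the second graph, candidate mappings are sampled and scored with a user-supplied matching cost, and the distributions are re-estimated from the best samples. Injectivity of the mapping is not enforced in this simplified version.

import numpy as np

def eda_matching(cost_fn, n1, n2, pop=200, elite=20, iters=30, seed=None):
    rng = np.random.default_rng(seed)
    P = np.full((n1, n2), 1.0 / n2)          # one categorical distribution per node of G1
    best, best_cost = None, np.inf
    for _ in range(iters):
        # sample a population of candidate mappings (one G2 node per G1 node)
        samples = np.stack([rng.choice(n2, size=pop, p=P[i]) for i in range(n1)], axis=1)
        costs = np.array([cost_fn(s) for s in samples])
        order = np.argsort(costs)
        if costs[order[0]] < best_cost:
            best, best_cost = samples[order[0]].copy(), costs[order[0]]
        # re-estimate the per-node distributions from the elite samples
        elite_samples = samples[order[:elite]]
        for i in range(n1):
            counts = np.bincount(elite_samples[:, i], minlength=n2) + 1.0   # Laplace smoothing
            P[i] = counts / counts.sum()
    return best, best_cost

A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])      # path graph
B = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]])      # triangle
def cost(m):                                         # edges of A whose images are not edges of B
    return sum(1 for i in range(3) for j in range(3) if A[i, j] and not B[m[i], m[j]])
best, best_cost = eda_matching(cost, 3, 3)
print(best_cost)                                     # 0: the path embeds into the triangle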

The 2009 paper by Emms et al. 44 presents an original approach to graph matching based on quantum computing, which uses the inherent parallelism of some quantum physics phenomena if run on a (hypothetical) quantum computer.

3. Other problems

In this section we will present some recent developments on graph problems that are not, in a strict sense, forms of graph matching, but are related to matching either because they provide a way of comparing two graphs (this is the case for graph embeddings and graph kernels), or because they use a graph-based approach to group input patterns into classes (in an unsupervised way for graph clustering, and in a supervised or semi-supervised way for graph learning). We also mention some works on other graph-related problems which are of specific interest as Pattern Recognition basic tools, such as dimensionality reduction.

Graph embeddings and graph kernels are perhaps the most significant novelty in graph-based Pattern Recognition in recent years. Although seminal works in these fields were already present in earlier literature, it is in the last decade that these techniques have gained popularity in the Pattern Recognition community. Gaertner et al. 55 present an early survey on kernels applied to non-vectorial data. Bunke et al. 12 in 2005 present a survey of graph kernels and other graph-related techniques. Bunke and Riesen 14 in their 2011 paper provide a useful review on the topic of graph kernels and graph embeddings; the same authors later extend this review and present these techniques as a way to unify the statistical and structural approaches in Pattern Recognition. Please note that, although it may seem that graph embeddings and kernels could help in reducing the computational complexity of graph comparison, many of the proposed algorithms have a cost that is equal to or higher than traditional matching methods (for instance, some embedding methods require computing the Graph Edit Distance, while others involve a cost that is related to the number of graphs in the considered set). The main benefit of the novel techniques is instead in the availability of the large corpus of theoretically sound techniques from statistical Pattern Recognition.

3.1. Graph embeddings

In the literature the term graph embedding is used with two slightly different meanings:

- a technique that maps the nodes of a graph onto points in a vector space, in such a way that nodes having similar structural properties (e.g. the structure of their neighborhood) are mapped onto points which are close in this space;
- a technique that maps whole graphs onto points in a vector space, in such a way that similar graphs are mapped onto close points.

[Fig. 2. Graph Embedding: the mapping between graphs and points in a vector space is represented by the graph name.]

The References 18, 16, 130, 45, discussed previously in Section 2.2, are examples of the first kind; also, the Dot Product Representation of Graphs 139 mentioned in Section 2.2 belongs to this category. Yan et al. 175 show in their 2007 paper that most commonly used dimensionality reduction techniques can be formulated as a graph embedding algorithm of this kind. Their work is the basis for an embedding technique proposed by You et al. 177, called General Solution for Supervised Graph Embedding (GSSGE). In the following subsections we will mainly concentrate on the second kind of graph embedding, presenting the relevant methods categorized according to the main properties they attempt to preserve in the mapping.

3.1.1. Isometric embeddings

Methods in this category start from a distance or similarity measure between graphs, and attempt to find a mapping to vectors that preserves this measure. E. Bonabeau 6, in a 2002 paper, proposes a technique based on a Self-Organizing Map (SOM), an unsupervised neural network adopting competitive learning, in order to map graphs onto a bidimensional plane. Although the term embedding is not explicitly used, it can be considered a form of graph embedding. The mapping found by the network is used both as an aid for the visualization of the data represented by the graphs, and for clustering. Also the 2003 paper by de Mauro et al. 36 uses a Neural Network for graph embedding. In particular, the proposed method works on directed acyclic graphs, and uses a Recursive Neural Network. The network is trained by similarity learning: the training set is made up of pairs of graphs which have been manually labeled with a similarity value, and the network aims to produce an output vector for each graph so that the Euclidean distance between vectors is consistent with the similarity between the corresponding graphs. A recent paper by Jouili and Tabbone 79 proposes a graph embedding technique based on constant shift embedding, a framework proposed for the embedding of non-metric spaces, mainly applied to clustering problems.
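The goal shared by these methods can be illustrated with a generic sketch based on classical multidimensional scaling (not one of the cited techniques): given a matrix D of pairwise graph distances, for instance edit distances, it returns one vector per graph whose Euclidean distances approximate D. The distance matrix below is an invented toy example.

import numpy as np

def classical_mds(D, dim=2):
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                  # double-centered squared distances
    vals, vecs = np.linalg.eigh(B)
    idx = np.argsort(vals)[::-1][:dim]           # keep the largest eigenvalues
    return vecs[:, idx] * np.sqrt(np.clip(vals[idx], 0, None))

# hypothetical pairwise distances between four graphs (two tight pairs, far apart)
D = np.array([[0, 1, 4, 4], [1, 0, 4, 4], [4, 4, 0, 1], [4, 4, 1, 0]], float)
X = classical_mds(D, dim=3)
print(round(float(np.linalg.norm(X[0] - X[1])), 2))   # 1.0, matching D[0, 1]
print(round(float(np.linalg.norm(X[0] - X[2])), 2))   # 4.0, matching D[0, 2]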

3.1.2. Spectral embeddings

The embedding algorithms in this subsection are based on the exploitation of spectral properties of graphs, i.e. properties related to the eigenvalues and eigenvectors of matrices representing the graphs, such as the adjacency matrix. Since spectral properties are invariant with respect to node permutations, they ensure that graphs with an isomorphic structure will be mapped to the same vectors.

Luo et al. 101 in a 2003 paper propose the use of spectral features for graph embedding; in particular, they decompose the adjacency matrix of a graph into its principal eigenmodes, and then compute from them a vector of numerical features (e.g. eigenmode volume, eigenmode perimeter, inter-eigenmode distances, etc.). Also the 2005 paper by Wilson et al. 170 uses spectral properties to define a graph embedding; in this case, the authors derive a set of polynomials from the spectral decomposition of the Laplacian of the adjacency matrix, and use the coefficients of these polynomials as feature vectors. Also the 2009 paper by Xiao et al. 172 proposes a graph embedding based on spectral properties; in particular, the method uses the heat kernel, i.e. the solution of the heat equation on the graph, to obtain a set of invariant properties used to obtain a vector representation of the graph. Xiao et al. 173 in a 2011 paper present an embedding for hierarchical graphs, obtained by a hierarchical segmentation of images. Spectral features are computed on the levels of the hierarchy, obtaining a fixed-size feature vector.

3.1.3. Subpattern embeddings

These methods are based on the detection, or the enumeration, of specific types of subpatterns within the graphs to be embedded. Torsello and Hancock 158 in 2007 propose an embedding algorithm for trees. The algorithm requires that all the trees to be embedded are known in advance. The embedding is based on the construction of a Union Tree, which is a directed, acyclic graph having all the considered trees as subgraphs; then each tree is represented by a vector that encodes which nodes of the Union Tree are used by the tree. W. Czech 33 proposes in a 2011 paper an embedding method based on B-matrices, which are a structure based on the path lengths between the nodes of a graph and are invariant with respect to node permutations. A recent paper by Luqman et al. 103 presents a fuzzy multilevel embedding technique, which combines structural information of the graph and information from the graph attributes using fuzzy histograms. The method uses an unsupervised learning phase to find the fuzzy intervals used in the representation.
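A deliberately crude illustration of the subpattern idea (a toy "bag of node labels and edge label pairs", far simpler than the methods above; the graphs and vocabulary are invented): each graph is mapped to a fixed-size count vector over a shared vocabulary of local substructures.

from collections import Counter

def subpattern_vector(node_labels, edges, vocabulary):
    # node_labels: dict node -> label; edges: iterable of (u, v) pairs
    counts = Counter(node_labels.values())
    counts.update(tuple(sorted((node_labels[u], node_labels[v]))) for u, v in edges)
    return [counts.get(pattern, 0) for pattern in vocabulary]

# two small labelled graphs and a fixed vocabulary of subpatterns
vocab = ["C", "O", ("C", "C"), ("C", "O")]
g1 = ({1: "C", 2: "C", 3: "O"}, [(1, 2), (2, 3)])
g2 = ({1: "C", 2: "O"}, [(1, 2)])
print(subpattern_vector(*g1, vocab))   # [2, 1, 1, 1]
print(subpattern_vector(*g2, vocab))   # [1, 1, 0, 1]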

In a 2011 paper, Gibert et al. 60 present a graph embedding based on graphs of words, which are an extension of the popular bag of words approach. The method assumes that the graphs are obtained from images, with nodes corresponding to salient points, and node attributes corresponding to visual descriptors of the points. The method performs a quantization of the attribute space, constructing a codebook. This codebook is used to produce an intermediate graph, called graph of words, whose nodes are the codebook values, and whose edges correspond to the adjacency in the original graph of nodes mapped to those codebook values. The nodes and edges of the intermediate graph are labeled with the counts of the corresponding nodes/edges of the original graph; then a histogram of these counts is used as the embedding. Two 2012 papers by the same authors further develop this method: in Ref. 62 the authors add a more sophisticated procedure for constructing the codebook, while in Ref. 61 they use a large set of features and apply a feature selection algorithm to determine the most significant ones. The same authors, in a 2013 paper 63, propose a somewhat similar embedding technique, which removes the assumption that the graphs are obtained from images, and also exploits edge attributes if they are present. The 2010 paper by Richiardi et al. 122 proposes two graph embedding techniques specifically tailored for graphs having the following constraints: the number of nodes is fixed across the entire considered set of graphs, and a total ordering is defined on the set of nodes. The authors show that a graph embedding exploiting these constraints can outperform a more general one.

3.1.4. Prototype-based embeddings

These embedding methods assume that a set of prototype graphs is available, and the mapping of a graph onto a vector space is based on the distances (obtained according to a suitably defined distance function) of the graph from the prototypes. This technique can be seen as a special case of the dissimilarity representations introduced by Pekalska and Duin 117.

The first of these methods has been proposed in 2007 by Riesen et al. 127. The method has one prototype graph for each dimension of the vector space; the corresponding component of the vector is simply defined as the Graph Edit Distance between the prototype and the graph to be embedded. The authors discuss several strategies for choosing the prototypes from a training set, and evaluate them by using the embedding for several classification tasks. In the same year, a paper by Riesen and Bunke 124 further develops this idea by proposing the use of several sets of randomly chosen prototypes, and combining the classifiers obtained for each of the corresponding embeddings to form a Multiple Classifier System. The advantage is that the resulting classifier is more robust with respect to the risk of a poor choice of the prototypes. A 2009 paper by Lee and Duin 93 explores a similar idea, but instead of a random selection of the prototypes, the proposed method creates different base classifiers by using node label information to extract different sets of subgraphs from the training set. In 2010, Lee et al. 94 propose a similar method in which, instead of extracting subgraphs, the node label information is used to alter the training graphs without changing their size.
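The dissimilarity-space idea itself fits in a few lines; the sketch below uses a crude placeholder distance (difference in node and edge counts) purely as an invented stand-in for Graph Edit Distance, and does not reproduce any of the prototype selection strategies discussed above.

def size_distance(g, h):
    # crude placeholder distance: difference in node and edge counts
    return abs(len(g["nodes"]) - len(h["nodes"])) + abs(len(g["edges"]) - len(h["edges"]))

def prototype_embedding(graph, prototypes, distance=size_distance):
    # one vector component per prototype: the distance to that prototype
    return [distance(graph, p) for p in prototypes]

prototypes = [{"nodes": [1], "edges": []},
              {"nodes": [1, 2, 3], "edges": [(1, 2), (2, 3), (1, 3)]}]
g = {"nodes": [1, 2], "edges": [(1, 2)]}
print(prototype_embedding(g, prototypes))   # [2, 3]: distances to the two prototypes

The resulting vectors can then be fed to any vector-based classifier, which is exactly the benefit these methods pursue.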

In a 2009 paper, Riesen and Bunke 123 present a Lipschitz embedding for graphs. Lipschitz embedding is usually employed to regularize vector spaces, but in this case it is proposed as a method to construct a graph embedding. Basically, each component of the vector representation of a graph is deduced from a set of prototype graphs; the value of the component is the mean distance (using Graph Edit Distance) to the corresponding set of prototypes (a different aggregation function than the mean could be used). The sets of prototypes are constructed using a clustering of a training set, based on the k-Medoids clustering algorithm. The same authors in another 2009 paper 126 propose a method for reducing the dimensionality of this embedding, by using Principal Component Analysis and Linear Discriminant Analysis. Bunke and Riesen 13 in 2011 propose an extension to this technique, which formulates the problem of choosing the reference graphs as a feature selection problem: a first embedding is built using a large number of reference graphs; then a feature selection algorithm is applied to the obtained vectors in order to select the most significant features, and only the reference graphs corresponding to these features are retained. Also the 2012 paper by Borzeshi et al. 8 addresses the problem of selecting the reference graphs for graph embedding. The authors present several algorithms based on a discriminative approach: they define several objective functions to measure how much the prototypes are able to discriminate between classes, and select the prototypes by a greedy optimization of these functions.

3.2. Graph kernels

A graph kernel is a function that maps a pair of graphs onto a real number, and has similar properties to the dot product defined on vectors. More formally, if we denote with \mathcal{G} the space of all graphs, a graph kernel is a function k such that:

k : \mathcal{G} \times \mathcal{G} \to \mathbb{R}   (10)

k(G_1, G_2) = k(G_2, G_1) \quad \forall G_1, G_2 \in \mathcal{G}   (11)

\sum_{i=1}^{n} \sum_{j=1}^{n} c_i c_j k(G_i, G_j) \geq 0 \quad \forall G_1, \ldots, G_n \in \mathcal{G}, \; \forall c_1, \ldots, c_n \in \mathbb{R}   (12)

Equation 11 requires the function k to be symmetric, while Equation 12 requires it to be positive semi-definite. Informally, a graph kernel can be considered as a measure of the similarity between two graphs; however, its formal properties allow a kernel to replace the vector dot product in several vector-based algorithms that use this operator (and other functions related to the dot product, such as the Euclidean norm). Among the many Pattern Recognition techniques that can be adapted to graphs using kernels we mention Support Vector Machine classifiers and Principal Component Analysis.
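A deliberately simple example of a function satisfying Eqs. (10)-(12) follows (an invented toy kernel, much cruder than the kernels discussed in this section): k(G_1, G_2) is the dot product of the node-label count vectors of the two graphs, so symmetry and positive semi-definiteness hold by construction, and the value could be plugged into, say, an SVM in place of the vector dot product.

from collections import Counter

def label_count_kernel(labels1, labels2):
    # explicit feature map: the vector of label counts; kernel = dot product
    c1, c2 = Counter(labels1), Counter(labels2)
    return sum(c1[label] * c2[label] for label in c1.keys() & c2.keys())

g1_labels = ["C", "C", "O", "H"]       # node labels of a first toy graph
g2_labels = ["C", "O", "O"]            # node labels of a second toy graph
print(label_count_kernel(g1_labels, g2_labels))   # 4 = 2*1 (for C) + 1*2 (for O)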

Kernels have been used for a long time to extend linear algorithms working on vector spaces to the nonlinear case, thanks to Mercer's theorem: given a kernel function defined on a compact Hausdorff space X, there is a vector space V and a mapping between X and V such that the value of the kernel computed on two points in X is equal to the dot product of the corresponding points in V. Thus a kernel can be seen as an implicit way of performing an embedding into a vector space. Although Mercer's theorem does not apply to graph kernels, in practice the latter can be used as a theoretically sound way to extend a vector algorithm to graphs. Of course, the performance of these algorithms strongly depends on the appropriateness (with respect to the task at hand) of the notion of similarity embodied in the graph kernel.

In their 2003 paper, Kashima et al. 82 specialize for the graph domain the idea of marginalized kernels, a probabilistic technique for defining a kernel based on the introduction of hidden variables. In this case, the hidden variable is a sequence of node indices, generated according to a random walk on one of the graphs. Given a value for the hidden variable, a kernel on sequences is computed using the sequence of visited nodes and edges; the marginalized kernel is obtained by computing the expected value (with respect to the joint distribution of the hidden and visible variables) of this sequence kernel. Mahé and Vert 105 in 2009 extend this technique to trees, and present an application to molecular data. Borgwardt and Kriegel 7 in 2005 present a graph kernel that is based on paths, instead of walks (a path is a walk without repeated nodes); in order to avoid the exponential cost of enumerating all the paths in a graph, the authors propose a scheme that uses only the shortest path between any pair of nodes, since the shortest paths can be computed in polynomial time.

Neuhaus and Bunke 112, in their 2006 paper, define three graph kernels based on Graph Edit Distance. The first kernel requires the choice of a zero pattern, a graph that, with respect to the kernel, will behave similarly to a null vector. The authors show that this kernel fulfils the theoretical requirements of a kernel function, but its practical performance is strongly affected by the choice of the zero pattern. The authors then introduce two other kernels, obtained from the sum and the product of the first kernel over a set of zero patterns, and show that they have the same theoretical properties, but are more robust with respect to the choice of these patterns. In their 2009 paper, Neuhaus et al. 114 present three possible ways to use Graph Edit Distance in the definition of a kernel. The first way is a diffusion kernel, which turns an edit distance matrix into a positive definite matrix satisfying the kernel properties, but has the inconvenience that the set of graphs to which it is applied must be finite and known a priori. The second way is a convolution kernel, which is based on a decomposition of the edit path between the two graphs into a sequence of substitution operations; given a kernel for individual substitutions, this approach provides a definition for a kernel between two graphs. The main drawback is the exponential complexity with respect to the number of nodes, for which the authors


More information

Simplicial Global Optimization

Simplicial Global Optimization Simplicial Global Optimization Julius Žilinskas Vilnius University, Lithuania September, 7 http://web.vu.lt/mii/j.zilinskas Global optimization Find f = min x A f (x) and x A, f (x ) = f, where A R n.

More information

Bioinformatics - Lecture 07

Bioinformatics - Lecture 07 Bioinformatics - Lecture 07 Bioinformatics Clusters and networks Martin Saturka http://www.bioplexity.org/lectures/ EBI version 0.4 Creative Commons Attribution-Share Alike 2.5 License Learning on profiles

More information

Preface MOTIVATION ORGANIZATION OF THE BOOK. Section 1: Basic Concepts of Graph Theory

Preface MOTIVATION ORGANIZATION OF THE BOOK. Section 1: Basic Concepts of Graph Theory xv Preface MOTIVATION Graph Theory as a well-known topic in discrete mathematics, has become increasingly under interest within recent decades. This is principally due to its applicability in a wide range

More information

Image Resizing Based on Gradient Vector Flow Analysis

Image Resizing Based on Gradient Vector Flow Analysis Image Resizing Based on Gradient Vector Flow Analysis Sebastiano Battiato battiato@dmi.unict.it Giovanni Puglisi puglisi@dmi.unict.it Giovanni Maria Farinella gfarinellao@dmi.unict.it Daniele Ravì rav@dmi.unict.it

More information

Geometric-Edge Random Graph Model for Image Representation

Geometric-Edge Random Graph Model for Image Representation Geometric-Edge Random Graph Model for Image Representation Bo JIANG, Jin TANG, Bin LUO CVPR Group, Anhui University /1/11, Beijing Acknowledgements This research is supported by the National Natural Science

More information

Spectral Methods for Network Community Detection and Graph Partitioning

Spectral Methods for Network Community Detection and Graph Partitioning Spectral Methods for Network Community Detection and Graph Partitioning M. E. J. Newman Department of Physics, University of Michigan Presenters: Yunqi Guo Xueyin Yu Yuanqi Li 1 Outline: Community Detection

More information

On the Relationships between Zero Forcing Numbers and Certain Graph Coverings

On the Relationships between Zero Forcing Numbers and Certain Graph Coverings On the Relationships between Zero Forcing Numbers and Certain Graph Coverings Fatemeh Alinaghipour Taklimi, Shaun Fallat 1,, Karen Meagher 2 Department of Mathematics and Statistics, University of Regina,

More information

Generic object recognition using graph embedding into a vector space

Generic object recognition using graph embedding into a vector space American Journal of Software Engineering and Applications 2013 ; 2(1) : 13-18 Published online February 20, 2013 (http://www.sciencepublishinggroup.com/j/ajsea) doi: 10.11648/j. ajsea.20130201.13 Generic

More information

Reflexive Regular Equivalence for Bipartite Data

Reflexive Regular Equivalence for Bipartite Data Reflexive Regular Equivalence for Bipartite Data Aaron Gerow 1, Mingyang Zhou 2, Stan Matwin 1, and Feng Shi 3 1 Faculty of Computer Science, Dalhousie University, Halifax, NS, Canada 2 Department of Computer

More information

A Parallel Algorithm for Finding Sub-graph Isomorphism

A Parallel Algorithm for Finding Sub-graph Isomorphism CS420: Parallel Programming, Fall 2008 Final Project A Parallel Algorithm for Finding Sub-graph Isomorphism Ashish Sharma, Santosh Bahir, Sushant Narsale, Unmil Tambe Department of Computer Science, Johns

More information

Lecture notes on the simplex method September We will present an algorithm to solve linear programs of the form. maximize.

Lecture notes on the simplex method September We will present an algorithm to solve linear programs of the form. maximize. Cornell University, Fall 2017 CS 6820: Algorithms Lecture notes on the simplex method September 2017 1 The Simplex Method We will present an algorithm to solve linear programs of the form maximize subject

More information

Understanding Clustering Supervising the unsupervised

Understanding Clustering Supervising the unsupervised Understanding Clustering Supervising the unsupervised Janu Verma IBM T.J. Watson Research Center, New York http://jverma.github.io/ jverma@us.ibm.com @januverma Clustering Grouping together similar data

More information

Graphs: Introduction. Ali Shokoufandeh, Department of Computer Science, Drexel University

Graphs: Introduction. Ali Shokoufandeh, Department of Computer Science, Drexel University Graphs: Introduction Ali Shokoufandeh, Department of Computer Science, Drexel University Overview of this talk Introduction: Notations and Definitions Graphs and Modeling Algorithmic Graph Theory and Combinatorial

More information

On Universal Cycles of Labeled Graphs

On Universal Cycles of Labeled Graphs On Universal Cycles of Labeled Graphs Greg Brockman Harvard University Cambridge, MA 02138 United States brockman@hcs.harvard.edu Bill Kay University of South Carolina Columbia, SC 29208 United States

More information

Spectral Clustering and Community Detection in Labeled Graphs

Spectral Clustering and Community Detection in Labeled Graphs Spectral Clustering and Community Detection in Labeled Graphs Brandon Fain, Stavros Sintos, Nisarg Raval Machine Learning (CompSci 571D / STA 561D) December 7, 2015 {btfain, nisarg, ssintos} at cs.duke.edu

More information

Community Detection. Community

Community Detection. Community Community Detection Community In social sciences: Community is formed by individuals such that those within a group interact with each other more frequently than with those outside the group a.k.a. group,

More information

CHAPTER 2 CONVENTIONAL AND NON-CONVENTIONAL TECHNIQUES TO SOLVE ORPD PROBLEM

CHAPTER 2 CONVENTIONAL AND NON-CONVENTIONAL TECHNIQUES TO SOLVE ORPD PROBLEM 20 CHAPTER 2 CONVENTIONAL AND NON-CONVENTIONAL TECHNIQUES TO SOLVE ORPD PROBLEM 2.1 CLASSIFICATION OF CONVENTIONAL TECHNIQUES Classical optimization methods can be classified into two distinct groups:

More information

3 No-Wait Job Shops with Variable Processing Times

3 No-Wait Job Shops with Variable Processing Times 3 No-Wait Job Shops with Variable Processing Times In this chapter we assume that, on top of the classical no-wait job shop setting, we are given a set of processing times for each operation. We may select

More information

A GRAPH FROM THE VIEWPOINT OF ALGEBRAIC TOPOLOGY

A GRAPH FROM THE VIEWPOINT OF ALGEBRAIC TOPOLOGY A GRAPH FROM THE VIEWPOINT OF ALGEBRAIC TOPOLOGY KARL L. STRATOS Abstract. The conventional method of describing a graph as a pair (V, E), where V and E repectively denote the sets of vertices and edges,

More information

Contextual priming for artificial visual perception

Contextual priming for artificial visual perception Contextual priming for artificial visual perception Hervé Guillaume 1, Nathalie Denquive 1, Philippe Tarroux 1,2 1 LIMSI-CNRS BP 133 F-91403 Orsay cedex France 2 ENS 45 rue d Ulm F-75230 Paris cedex 05

More information

STATISTICS AND ANALYSIS OF SHAPE

STATISTICS AND ANALYSIS OF SHAPE Control and Cybernetics vol. 36 (2007) No. 2 Book review: STATISTICS AND ANALYSIS OF SHAPE by H. Krim, A. Yezzi, Jr., eds. There are numerous definitions of a notion of shape of an object. These definitions

More information

Lecture 9 - Matrix Multiplication Equivalences and Spectral Graph Theory 1

Lecture 9 - Matrix Multiplication Equivalences and Spectral Graph Theory 1 CME 305: Discrete Mathematics and Algorithms Instructor: Professor Aaron Sidford (sidford@stanfordedu) February 6, 2018 Lecture 9 - Matrix Multiplication Equivalences and Spectral Graph Theory 1 In the

More information

A Partition Method for Graph Isomorphism

A Partition Method for Graph Isomorphism Available online at www.sciencedirect.com Physics Procedia ( ) 6 68 International Conference on Solid State Devices and Materials Science A Partition Method for Graph Isomorphism Lijun Tian, Chaoqun Liu

More information

Clustering and Visualisation of Data

Clustering and Visualisation of Data Clustering and Visualisation of Data Hiroshi Shimodaira January-March 28 Cluster analysis aims to partition a data set into meaningful or useful groups, based on distances between data points. In some

More information

Lecture 5: Graphs. Rajat Mittal. IIT Kanpur

Lecture 5: Graphs. Rajat Mittal. IIT Kanpur Lecture : Graphs Rajat Mittal IIT Kanpur Combinatorial graphs provide a natural way to model connections between different objects. They are very useful in depicting communication networks, social networks

More information

Robust Shape Retrieval Using Maximum Likelihood Theory

Robust Shape Retrieval Using Maximum Likelihood Theory Robust Shape Retrieval Using Maximum Likelihood Theory Naif Alajlan 1, Paul Fieguth 2, and Mohamed Kamel 1 1 PAMI Lab, E & CE Dept., UW, Waterloo, ON, N2L 3G1, Canada. naif, mkamel@pami.uwaterloo.ca 2

More information

4.12 Generalization. In back-propagation learning, as many training examples as possible are typically used.

4.12 Generalization. In back-propagation learning, as many training examples as possible are typically used. 1 4.12 Generalization In back-propagation learning, as many training examples as possible are typically used. It is hoped that the network so designed generalizes well. A network generalizes well when

More information

On Soft Topological Linear Spaces

On Soft Topological Linear Spaces Republic of Iraq Ministry of Higher Education and Scientific Research University of AL-Qadisiyah College of Computer Science and Formation Technology Department of Mathematics On Soft Topological Linear

More information

A Unified Framework to Integrate Supervision and Metric Learning into Clustering

A Unified Framework to Integrate Supervision and Metric Learning into Clustering A Unified Framework to Integrate Supervision and Metric Learning into Clustering Xin Li and Dan Roth Department of Computer Science University of Illinois, Urbana, IL 61801 (xli1,danr)@uiuc.edu December

More information

Clustering: Classic Methods and Modern Views

Clustering: Classic Methods and Modern Views Clustering: Classic Methods and Modern Views Marina Meilă University of Washington mmp@stat.washington.edu June 22, 2015 Lorentz Center Workshop on Clusters, Games and Axioms Outline Paradigms for clustering

More information

Relative Constraints as Features

Relative Constraints as Features Relative Constraints as Features Piotr Lasek 1 and Krzysztof Lasek 2 1 Chair of Computer Science, University of Rzeszow, ul. Prof. Pigonia 1, 35-510 Rzeszow, Poland, lasek@ur.edu.pl 2 Institute of Computer

More information

Machine Learning for OR & FE

Machine Learning for OR & FE Machine Learning for OR & FE Unsupervised Learning: Clustering Martin Haugh Department of Industrial Engineering and Operations Research Columbia University Email: martin.b.haugh@gmail.com (Some material

More information

Machine Learning : Clustering, Self-Organizing Maps

Machine Learning : Clustering, Self-Organizing Maps Machine Learning Clustering, Self-Organizing Maps 12/12/2013 Machine Learning : Clustering, Self-Organizing Maps Clustering The task: partition a set of objects into meaningful subsets (clusters). The

More information

INF4820 Algorithms for AI and NLP. Evaluating Classifiers Clustering

INF4820 Algorithms for AI and NLP. Evaluating Classifiers Clustering INF4820 Algorithms for AI and NLP Evaluating Classifiers Clustering Murhaf Fares & Stephan Oepen Language Technology Group (LTG) September 27, 2017 Today 2 Recap Evaluation of classifiers Unsupervised

More information

Journal of Asian Scientific Research FEATURES COMPOSITION FOR PROFICIENT AND REAL TIME RETRIEVAL IN CBIR SYSTEM. Tohid Sedghi

Journal of Asian Scientific Research FEATURES COMPOSITION FOR PROFICIENT AND REAL TIME RETRIEVAL IN CBIR SYSTEM. Tohid Sedghi Journal of Asian Scientific Research, 013, 3(1):68-74 Journal of Asian Scientific Research journal homepage: http://aessweb.com/journal-detail.php?id=5003 FEATURES COMPOSTON FOR PROFCENT AND REAL TME RETREVAL

More information

Introduction to Machine Learning

Introduction to Machine Learning Introduction to Machine Learning Clustering Varun Chandola Computer Science & Engineering State University of New York at Buffalo Buffalo, NY, USA chandola@buffalo.edu Chandola@UB CSE 474/574 1 / 19 Outline

More information

Diffusion Wavelets for Natural Image Analysis

Diffusion Wavelets for Natural Image Analysis Diffusion Wavelets for Natural Image Analysis Tyrus Berry December 16, 2011 Contents 1 Project Description 2 2 Introduction to Diffusion Wavelets 2 2.1 Diffusion Multiresolution............................

More information

Sparse Matrices Reordering using Evolutionary Algorithms: A Seeded Approach

Sparse Matrices Reordering using Evolutionary Algorithms: A Seeded Approach 1 Sparse Matrices Reordering using Evolutionary Algorithms: A Seeded Approach David Greiner, Gustavo Montero, Gabriel Winter Institute of Intelligent Systems and Numerical Applications in Engineering (IUSIANI)

More information

Generalized trace ratio optimization and applications

Generalized trace ratio optimization and applications Generalized trace ratio optimization and applications Mohammed Bellalij, Saïd Hanafi, Rita Macedo and Raca Todosijevic University of Valenciennes, France PGMO Days, 2-4 October 2013 ENSTA ParisTech PGMO

More information

2. LITERATURE REVIEW

2. LITERATURE REVIEW 2. LITERATURE REVIEW CBIR has come long way before 1990 and very little papers have been published at that time, however the number of papers published since 1997 is increasing. There are many CBIR algorithms

More information

Pattern Mining in Frequent Dynamic Subgraphs

Pattern Mining in Frequent Dynamic Subgraphs Pattern Mining in Frequent Dynamic Subgraphs Karsten M. Borgwardt, Hans-Peter Kriegel, Peter Wackersreuther Institute of Computer Science Ludwig-Maximilians-Universität Munich, Germany kb kriegel wackersr@dbs.ifi.lmu.de

More information

Selecting Models from Videos for Appearance-Based Face Recognition

Selecting Models from Videos for Appearance-Based Face Recognition Selecting Models from Videos for Appearance-Based Face Recognition Abdenour Hadid and Matti Pietikäinen Machine Vision Group Infotech Oulu and Department of Electrical and Information Engineering P.O.

More information

Clustering. CS294 Practical Machine Learning Junming Yin 10/09/06

Clustering. CS294 Practical Machine Learning Junming Yin 10/09/06 Clustering CS294 Practical Machine Learning Junming Yin 10/09/06 Outline Introduction Unsupervised learning What is clustering? Application Dissimilarity (similarity) of objects Clustering algorithm K-means,

More information

6. Lecture notes on matroid intersection

6. Lecture notes on matroid intersection Massachusetts Institute of Technology 18.453: Combinatorial Optimization Michel X. Goemans May 2, 2017 6. Lecture notes on matroid intersection One nice feature about matroids is that a simple greedy algorithm

More information

Unsupervised Learning : Clustering

Unsupervised Learning : Clustering Unsupervised Learning : Clustering Things to be Addressed Traditional Learning Models. Cluster Analysis K-means Clustering Algorithm Drawbacks of traditional clustering algorithms. Clustering as a complex

More information

Image Classification Using Wavelet Coefficients in Low-pass Bands

Image Classification Using Wavelet Coefficients in Low-pass Bands Proceedings of International Joint Conference on Neural Networks, Orlando, Florida, USA, August -7, 007 Image Classification Using Wavelet Coefficients in Low-pass Bands Weibao Zou, Member, IEEE, and Yan

More information

Locality Preserving Projections (LPP) Abstract

Locality Preserving Projections (LPP) Abstract Locality Preserving Projections (LPP) Xiaofei He Partha Niyogi Computer Science Department Computer Science Department The University of Chicago The University of Chicago Chicago, IL 60615 Chicago, IL

More information

Monotone Paths in Geometric Triangulations

Monotone Paths in Geometric Triangulations Monotone Paths in Geometric Triangulations Adrian Dumitrescu Ritankar Mandal Csaba D. Tóth November 19, 2017 Abstract (I) We prove that the (maximum) number of monotone paths in a geometric triangulation

More information

Rigidity, connectivity and graph decompositions

Rigidity, connectivity and graph decompositions First Prev Next Last Rigidity, connectivity and graph decompositions Brigitte Servatius Herman Servatius Worcester Polytechnic Institute Page 1 of 100 First Prev Next Last Page 2 of 100 We say that a framework

More information

Texture Image Segmentation using FCM

Texture Image Segmentation using FCM Proceedings of 2012 4th International Conference on Machine Learning and Computing IPCSIT vol. 25 (2012) (2012) IACSIT Press, Singapore Texture Image Segmentation using FCM Kanchan S. Deshmukh + M.G.M

More information

One-class Problems and Outlier Detection. 陶卿 中国科学院自动化研究所

One-class Problems and Outlier Detection. 陶卿 中国科学院自动化研究所 One-class Problems and Outlier Detection 陶卿 Qing.tao@mail.ia.ac.cn 中国科学院自动化研究所 Application-driven Various kinds of detection problems: unexpected conditions in engineering; abnormalities in medical data,

More information

Approximate Graph Edit Distance Guided by Bipartite Matching of Bags of Walks

Approximate Graph Edit Distance Guided by Bipartite Matching of Bags of Walks Approximate Graph Edit Distance Guided by Bipartite Matching of Bags of Walks Benoit Gaüzère 1, Sébastien Bougleux 2, Kaspar Riesen 3, and Luc Brun 1 1 ENSICAEN, GREYC CNRS UMR 6072, France {benoit.gauzere,luc.brun}@ensicaen.fr

More information

Training Digital Circuits with Hamming Clustering

Training Digital Circuits with Hamming Clustering IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I: FUNDAMENTAL THEORY AND APPLICATIONS, VOL. 47, NO. 4, APRIL 2000 513 Training Digital Circuits with Hamming Clustering Marco Muselli, Member, IEEE, and Diego

More information

Classification. Vladimir Curic. Centre for Image Analysis Swedish University of Agricultural Sciences Uppsala University

Classification. Vladimir Curic. Centre for Image Analysis Swedish University of Agricultural Sciences Uppsala University Classification Vladimir Curic Centre for Image Analysis Swedish University of Agricultural Sciences Uppsala University Outline An overview on classification Basics of classification How to choose appropriate

More information

Preimages of Small Geometric Cycles

Preimages of Small Geometric Cycles Preimages of Small Geometric Cycles Sally Cockburn Department of Mathematics Hamilton College, Clinton, NY scockbur@hamilton.edu Abstract A graph G is a homomorphic preimage of another graph H, or equivalently

More information

Unlabeled equivalence for matroids representable over finite fields

Unlabeled equivalence for matroids representable over finite fields Unlabeled equivalence for matroids representable over finite fields November 16, 2012 S. R. Kingan Department of Mathematics Brooklyn College, City University of New York 2900 Bedford Avenue Brooklyn,

More information

Loopy Belief Propagation

Loopy Belief Propagation Loopy Belief Propagation Research Exam Kristin Branson September 29, 2003 Loopy Belief Propagation p.1/73 Problem Formalization Reasoning about any real-world problem requires assumptions about the structure

More information

2386 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 52, NO. 6, JUNE 2006

2386 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 52, NO. 6, JUNE 2006 2386 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 52, NO. 6, JUNE 2006 The Encoding Complexity of Network Coding Michael Langberg, Member, IEEE, Alexander Sprintson, Member, IEEE, and Jehoshua Bruck,

More information

Topology-Invariant Similarity and Diffusion Geometry

Topology-Invariant Similarity and Diffusion Geometry 1 Topology-Invariant Similarity and Diffusion Geometry Lecture 7 Alexander & Michael Bronstein tosca.cs.technion.ac.il/book Numerical geometry of non-rigid shapes Stanford University, Winter 2009 Intrinsic

More information

Function approximation using RBF network. 10 basis functions and 25 data points.

Function approximation using RBF network. 10 basis functions and 25 data points. 1 Function approximation using RBF network F (x j ) = m 1 w i ϕ( x j t i ) i=1 j = 1... N, m 1 = 10, N = 25 10 basis functions and 25 data points. Basis function centers are plotted with circles and data

More information

Contents. Preface to the Second Edition

Contents. Preface to the Second Edition Preface to the Second Edition v 1 Introduction 1 1.1 What Is Data Mining?....................... 4 1.2 Motivating Challenges....................... 5 1.3 The Origins of Data Mining....................

More information

Data Analysis 3. Support Vector Machines. Jan Platoš October 30, 2017

Data Analysis 3. Support Vector Machines. Jan Platoš October 30, 2017 Data Analysis 3 Support Vector Machines Jan Platoš October 30, 2017 Department of Computer Science Faculty of Electrical Engineering and Computer Science VŠB - Technical University of Ostrava Table of

More information

IMAGE ANALYSIS, CLASSIFICATION, and CHANGE DETECTION in REMOTE SENSING

IMAGE ANALYSIS, CLASSIFICATION, and CHANGE DETECTION in REMOTE SENSING SECOND EDITION IMAGE ANALYSIS, CLASSIFICATION, and CHANGE DETECTION in REMOTE SENSING ith Algorithms for ENVI/IDL Morton J. Canty с*' Q\ CRC Press Taylor &. Francis Group Boca Raton London New York CRC

More information

Semi-Supervised Clustering with Partial Background Information

Semi-Supervised Clustering with Partial Background Information Semi-Supervised Clustering with Partial Background Information Jing Gao Pang-Ning Tan Haibin Cheng Abstract Incorporating background knowledge into unsupervised clustering algorithms has been the subject

More information

A Relational View of Subgraph Isomorphism

A Relational View of Subgraph Isomorphism A Relational View of Subgraph Isomorphism J. Cortadella and G. Valiente Department of Software, Technical University of Catalonia, Barcelona, Spain Abstract. This paper presents a novel approach to the

More information

Hierarchical Multi level Approach to graph clustering

Hierarchical Multi level Approach to graph clustering Hierarchical Multi level Approach to graph clustering by: Neda Shahidi neda@cs.utexas.edu Cesar mantilla, cesar.mantilla@mail.utexas.edu Advisor: Dr. Inderjit Dhillon Introduction Data sets can be presented

More information

A CSP Search Algorithm with Reduced Branching Factor

A CSP Search Algorithm with Reduced Branching Factor A CSP Search Algorithm with Reduced Branching Factor Igor Razgon and Amnon Meisels Department of Computer Science, Ben-Gurion University of the Negev, Beer-Sheva, 84-105, Israel {irazgon,am}@cs.bgu.ac.il

More information

CHAPTER 1 INTRODUCTION

CHAPTER 1 INTRODUCTION 1 CHAPTER 1 INTRODUCTION 1.1 Advance Encryption Standard (AES) Rijndael algorithm is symmetric block cipher that can process data blocks of 128 bits, using cipher keys with lengths of 128, 192, and 256

More information

Graph Similarity for Data Mining Lot 6 of Data Analytics FY17/18

Graph Similarity for Data Mining Lot 6 of Data Analytics FY17/18 Graph Similarity for Data Mining Lot 6 of Data Analytics FY17/18 Meeting Presentation Peter Rodgers Algorithms for the Comparison of Graphs Graph Isomorphism Graph Edit Distance Various Others Used in

More information

Wavelet Applications. Texture analysis&synthesis. Gloria Menegaz 1

Wavelet Applications. Texture analysis&synthesis. Gloria Menegaz 1 Wavelet Applications Texture analysis&synthesis Gloria Menegaz 1 Wavelet based IP Compression and Coding The good approximation properties of wavelets allow to represent reasonably smooth signals with

More information

A Course in Machine Learning

A Course in Machine Learning A Course in Machine Learning Hal Daumé III 13 UNSUPERVISED LEARNING If you have access to labeled training data, you know what to do. This is the supervised setting, in which you have a teacher telling

More information

Kernels and representation

Kernels and representation Kernels and representation Corso di AA, anno 2017/18, Padova Fabio Aiolli 20 Dicembre 2017 Fabio Aiolli Kernels and representation 20 Dicembre 2017 1 / 19 (Hierarchical) Representation Learning Hierarchical

More information

A Comparative study of Clustering Algorithms using MapReduce in Hadoop

A Comparative study of Clustering Algorithms using MapReduce in Hadoop A Comparative study of Clustering Algorithms using MapReduce in Hadoop Dweepna Garg 1, Khushboo Trivedi 2, B.B.Panchal 3 1 Department of Computer Science and Engineering, Parul Institute of Engineering

More information

Latest development in image feature representation and extraction

Latest development in image feature representation and extraction International Journal of Advanced Research and Development ISSN: 2455-4030, Impact Factor: RJIF 5.24 www.advancedjournal.com Volume 2; Issue 1; January 2017; Page No. 05-09 Latest development in image

More information

Search Engines. Information Retrieval in Practice

Search Engines. Information Retrieval in Practice Search Engines Information Retrieval in Practice All slides Addison Wesley, 2008 Classification and Clustering Classification and clustering are classical pattern recognition / machine learning problems

More information

Lecture 2 September 3

Lecture 2 September 3 EE 381V: Large Scale Optimization Fall 2012 Lecture 2 September 3 Lecturer: Caramanis & Sanghavi Scribe: Hongbo Si, Qiaoyang Ye 2.1 Overview of the last Lecture The focus of the last lecture was to give

More information