Bipartite Edge Prediction via Transductive Learning over Product Graphs
Hanxiao Liu, Yiming Yang
School of Computer Science, Carnegie Mellon University
ICML 2015, July 8, 2015
Outline
1 Problem Description
2 The Proposed Framework
3 Formulation
  Product Graph Construction
  Graph-based Transductive Learning
4 Optimization
5 Experiment
6 Conclusion
Problem Description
Many applications involve predicting the edges of a bipartite graph between two vertex sets, e.g. {I, II} and {A, B, C}:
1 Recommender systems
2 Host-pathogen interaction
3 Question-answering mapping
4 Citation networks ...
Sometimes the vertex sets on both sides are intrinsically structured, with their internal similarities given by a graph G on one side and a graph H on the other. This yields heterogeneous information: G + H + partial observations. Combining them leads to better edge predictions.
The Proposed Framework
[Figure: bipartite graph between {I, II} and {A, B, C} with side graphs G and H; two labeled edges carry the values −2 and +5.]
Transductive learning should be effective:
1 Labeled edges (red) are highly sparse
2 Unlabeled edges (gray) are massively available
Assumption: similar edges should have similar labels.
Prerequisite: a similarity measure among the edges, i.e. a Graph of Edges (not directly provided). It can be induced from G and H via graph product!
The Graph of Edges can be induced by taking the product of G and H. In the product graph G × H:
- each vertex corresponds to an edge in the original bipartite graph;
- each edge encodes an edge-edge similarity.
The adjacency matrix of the product graph is defined by the graph product (to be discussed later).
Problem Mapping
Edge Prediction (original problem): given G, H and labeled edges, predict the unlabeled edges.
Vertex Prediction (equivalent problem): given G × H and labeled vertices, predict the unlabeled vertices.
The bipartite edges (I, A), (I, B), (I, C), (II, A), (II, B), (II, C) become the vertices of the product graph; the two observed labels (−2 and +5) carry over to the corresponding product-graph vertices.
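The problem mapping above can be sketched in a few lines. This is a minimal illustration with hypothetical label assignments (the slide figure shows the values −2 and +5; which exact edges carry them is chosen here for illustration):

```python
# Problem mapping: each edge of the bipartite graph becomes a vertex of the
# product graph, so edge labels become vertex labels and edge prediction
# becomes vertex prediction.

g_vertices = ["I", "II"]       # vertex set of graph G
h_vertices = ["A", "B", "C"]   # vertex set of graph H

# Partially observed edge labels (illustrative assignment).
edge_labels = {("I", "A"): -2, ("II", "B"): +5}

# Vertices of the product graph: all (i, j) pairs, i.e. all possible edges.
product_vertices = [(i, j) for i in g_vertices for j in h_vertices]

# Labeled vertices inherit the edge labels; the rest are to be predicted.
vertex_labels = {v: edge_labels.get(v) for v in product_vertices}
print(vertex_labels[("I", "A")])   # -2
print(vertex_labels[("I", "B")])   # None (unlabeled, to be predicted)
```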
Product Graph Construction
Q: When should vertex (i, j) ∼ (i′, j′) hold in the product graph?
Tensor GP: i ∼ i′ in G AND j ∼ j′ in H
Cartesian GP: (i ∼ i′ in G AND j = j′) OR (i = i′ AND j ∼ j′ in H)
Both can be trivially generalized to weighted graphs.
To compute the adjacency matrices of the PGs:
G ×_Tensor H = G ⊗ H, the Kronecker (a.k.a. tensor) product
G ×_Cartesian H = G ⊗ I + I ⊗ H = G ⊕ H, the Kronecker sum
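The two constructions above can be sketched with toy adjacency matrices (the graphs G and H here are illustrative, not from the paper):

```python
import numpy as np

# Toy adjacency matrices: G is a path on 2 vertices, H a path on 3 vertices.
G = np.array([[0., 1.],
              [1., 0.]])
H = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
m, n = G.shape[0], H.shape[0]

tensor_gp = np.kron(G, H)                                     # Kronecker (tensor) product
cartesian_gp = np.kron(G, np.eye(n)) + np.kron(np.eye(m), H)  # Kronecker sum

# Product-graph vertex (i, j) maps to row/column i * n + j.
# Tensor GP: (0,0) ~ (1,1) because 0 ~ 1 in G AND 0 ~ 1 in H.
print(tensor_gp[0 * n + 0, 1 * n + 1])      # 1.0
# Cartesian GP: (0,0) ~ (1,0) because 0 ~ 1 in G and the H-vertex is unchanged.
print(cartesian_gp[0 * n + 0, 1 * n + 0])   # 1.0
```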
Both GPs can be written in the form of a spectral decomposition:
G ×_Tensor H = Σ_{i,j} (λ_i · μ_j) (u_i ⊗ v_j)(u_i ⊗ v_j)ᵀ    (1)   ["soft AND"]
G ×_Cartesian H = Σ_{i,j} (λ_i + μ_j) (u_i ⊗ v_j)(u_i ⊗ v_j)ᵀ    (2)   ["soft OR"]
The interplay of the graphs is captured by the interplay of their spectra!
Generalization, the Spectral Graph Product:
G ⊛ H := Σ_{i,j} (λ_i ⊛ μ_j) (u_i ⊗ v_j)(u_i ⊗ v_j)ᵀ    (3)
where ⊛ can be an arbitrary binary operator (·, +, ...).
Commutative property: G ⊛ H and H ⊛ G are isomorphic.
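The spectral view can be checked numerically on small symmetric matrices: with ⊛ as multiplication the spectral construction recovers the Kronecker product, and with ⊛ as addition it recovers the Kronecker sum. A sketch with random toy data:

```python
import numpy as np

rng = np.random.default_rng(0)
G = rng.random((3, 3)); G = (G + G.T) / 2   # symmetric toy "adjacency"
H = rng.random((4, 4)); H = (H + H.T) / 2

lam, U = np.linalg.eigh(G)   # G = U diag(lam) U^T
mu, V = np.linalg.eigh(H)    # H = V diag(mu) V^T

def spectral_gp(op):
    """Sum over i, j of op(lam_i, mu_j) (u_i kron v_j)(u_i kron v_j)^T."""
    size = G.shape[0] * H.shape[0]
    A = np.zeros((size, size))
    for i in range(len(lam)):
        for j in range(len(mu)):
            w = np.kron(U[:, i], V[:, j])
            A += op(lam[i], mu[j]) * np.outer(w, w)
    return A

tensor = spectral_gp(lambda a, b: a * b)
cartesian = spectral_gp(lambda a, b: a + b)

print(np.allclose(tensor, np.kron(G, H)))                          # True
print(np.allclose(cartesian,
                  np.kron(G, np.eye(4)) + np.kron(np.eye(3), H)))  # True
```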
Graph-based Transductive Learning
With the product graph A := G ⊛ H constructed, we solve a standard graph-based transductive learning problem over A.
Learning objective:
min_f  l(f) + λ fᵀ A⁻¹ f    (4)
where l(f) is the loss function and λ fᵀ A⁻¹ f the graph regularization:
- f_i is the system-predicted value for vertex i in A;
- l(f) quantifies the gap between f and the partially observed labels;
- λ fᵀ A⁻¹ f quantifies the smoothness over the graph.
Underlying assumption: f ∼ N(0, A).
The enhanced learning objective
min_f  l(f) + λ fᵀ κ(A)⁻¹ f    (5)
incorporates a variety of graph transduction patterns:
- k-step random walk: κ(A) = A^k
- Regularized Laplacian: κ(A) = (εI − A)⁻¹ = Σ_{k≥0} ε^{−(k+1)} A^k
- Diffusion process: κ(A) = exp(A) = I + A + (1/2!) A² + (1/3!) A³ + ...
All of these can be viewed as transforming the spectrum of A := Σ_i θ_i u_i u_iᵀ:
A^k = Σ_i θ_i^k u_i u_iᵀ
(εI − A)⁻¹ = Σ_i (ε − θ_i)⁻¹ u_i u_iᵀ
exp(A) = Σ_i e^{θ_i} u_i u_iᵀ
Optimization
Transductive learning over the product graph:
min_f  l(f) + λ fᵀ κ(A)⁻¹ f,  with r(f) := λ fᵀ κ(A)⁻¹ f    (6)
Challenge: κ(A) = κ(G ⊛ H), with G of size m × m and H of size n × n, is a huge mn × mn matrix!
- Prohibitive to load into memory → no need to store κ(A)
- Prohibitive to compute its inverse → no need of a matrix inverse
- Even if κ(A)⁻¹ were given, computing r(f) naively is expensive → it can be performed much more efficiently
Keys for complexity reduction
1 Instead of matrices: κ only manipulates eigenvalues, and ⊛ only manipulates the interplay of eigenvalues.
2 The vec trick. The bottleneck is the multiplication (X ⊗ Y) f, where f = vec(F) and F_ij := system-predicted score for edge (i, j). By the Kronecker identity,
(X ⊗ Y) f = (X ⊗ Y) vec(F) = vec(Y F Xᵀ)    (7)
The left-hand side costs O(m²n²) time/space; the right-hand side O(mn(m + n)) time and O((m + n)²) space.
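The vec trick can be sketched on toy matrices; note that with the column-stacking convention for vec, F is taken as n × m here so the dimensions match:

```python
import numpy as np

# Multiplying by the Kronecker product X kron Y never needs the mn x mn
# matrix: (X kron Y) vec(F) = vec(Y F X^T) for column-stacking vec.
rng = np.random.default_rng(2)
m, n = 4, 3
X = rng.random((m, m))
Y = rng.random((n, n))
F = rng.random((n, m))   # n x m so that vec(F) has length mn

naive = np.kron(X, Y) @ F.flatten(order="F")   # O(m^2 n^2): forms the big matrix
trick = (Y @ F @ X.T).flatten(order="F")       # O(mn(m + n)): never forms it
print(np.allclose(naive, trick))               # True
```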
Optimization with Low-rank Constraint
Further speedup is possible by factorizing F into two low-rank matrices. The cost of each alternating gradient step is proportional to rank(F) · rank(Σ), where Σ is a characteristic matrix with Σ_ij = 1 / κ(λ_i ⊛ μ_j).
An interesting observation: rank(Σ) is usually a small constant!
Example: the diffusion process over the Cartesian PG gives
Σ = [ e^{−(λ_i + μ_j)} ]_{i,j} = ( e^{−λ_1}, ..., e^{−λ_m} )ᵀ ( e^{−μ_1}, ..., e^{−μ_n} )  ⇒  rank(Σ) = 1
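The rank-1 observation for this example is easy to check numerically; the spectra below are illustrative toy values:

```python
import numpy as np

# Diffusion over the Cartesian PG: Sigma_ij = 1/kappa(lam_i + mu_j)
#                                           = exp(-(lam_i + mu_j)),
# which factorizes as an outer product, hence rank(Sigma) = 1.
lam = np.array([0.3, 1.2, 2.0, 2.7])   # toy spectrum of G
mu = np.array([0.5, 1.5, 2.5])         # toy spectrum of H

Sigma = np.exp(-(lam[:, None] + mu[None, :]))   # Sigma_ij = e^{-(lam_i + mu_j)}
print(np.linalg.matrix_rank(Sigma))             # 1

# Explicit rank-1 factorization: Sigma = e^{-lam} (e^{-mu})^T
print(np.allclose(Sigma, np.outer(np.exp(-lam), np.exp(-mu))))   # True
```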
Datasets and Baselines
Datasets (G / H):
  MovieLens-100K: users / movies
  Cora: publications / publications
  Courses: courses / prerequisite courses
Baselines:
  MC: Matrix Completion; ignores the info of G and H.
  TK: Tensor Kernel; implicitly constructs the PG, no transduction.
  GRMC: Graph-Regularized Matrix Completion; transduction over G and H, no PG constructed.
Results
Performance of several interesting combinations of κ (graph transduction) and ⊛ (graph product), measured in MAP and AUC on each dataset (Courses, Cora, MovieLens):
  Random Walk + Tensor GP
  Diffusion + Cartesian GP
  von Neumann + Tensor GP
  von Neumann + Cartesian GP
  Sigmoid + Cartesian GP
[table values not preserved in the transcription]
Results
Proposed method (Diffusion + Cartesian GP) vs. baselines (MC, GRMC, TK), measured in MAP, AUC and nDCG@3 on each dataset (Courses, Cora, MovieLens). [table values not preserved in the transcription]
Conclusion
Summary
  Problem: predicting the missing edges of a bipartite graph with graph-structured vertex sets on both sides.
  Contribution: a novel approach via transductive learning over a product graph, an efficient algorithmic solution, and good empirical results.
On-going work
  Extend to k graphs (k > 2): bipartite graph → k-partite graph, edge → hyperedge.
  Determine the optimal graph product for any given problem.
Thanks!
More informationDiffusion Wavelets for Natural Image Analysis
Diffusion Wavelets for Natural Image Analysis Tyrus Berry December 16, 2011 Contents 1 Project Description 2 2 Introduction to Diffusion Wavelets 2 2.1 Diffusion Multiresolution............................
More informationSampling Large Graphs for Anticipatory Analysis
Sampling Large Graphs for Anticipatory Analysis Lauren Edwards*, Luke Johnson, Maja Milosavljevic, Vijay Gadepally, Benjamin A. Miller IEEE High Performance Extreme Computing Conference September 16, 2015
More informationAn Introduction to Graph Theory
An Introduction to Graph Theory CIS008-2 Logic and Foundations of Mathematics David Goodwin david.goodwin@perisic.com 12:00, Friday 17 th February 2012 Outline 1 Graphs 2 Paths and cycles 3 Graphs and
More informationKernels + K-Means Introduction to Machine Learning. Matt Gormley Lecture 29 April 25, 2018
10-601 Introduction to Machine Learning Machine Learning Department School of Computer Science Carnegie Mellon University Kernels + K-Means Matt Gormley Lecture 29 April 25, 2018 1 Reminders Homework 8:
More informationby conservation of flow, hence the cancelation. Similarly, we have
Chapter 13: Network Flows and Applications Network: directed graph with source S and target T. Non-negative edge weights represent capacities. Assume no edges into S or out of T. (If necessary, we can
More informationNetwork Traffic Measurements and Analysis
DEIB - Politecnico di Milano Fall, 2017 Sources Hastie, Tibshirani, Friedman: The Elements of Statistical Learning James, Witten, Hastie, Tibshirani: An Introduction to Statistical Learning Andrew Ng:
More informationSocial Network Analysis
Social Network Analysis Mathematics of Networks Manar Mohaisen Department of EEC Engineering Adjacency matrix Network types Edge list Adjacency list Graph representation 2 Adjacency matrix Adjacency matrix
More informationUsing PageRank in Feature Selection
Using PageRank in Feature Selection Dino Ienco, Rosa Meo, and Marco Botta Dipartimento di Informatica, Università di Torino, Italy {ienco,meo,botta}@di.unito.it Abstract. Feature selection is an important
More informationDS Machine Learning and Data Mining I. Alina Oprea Associate Professor, CCIS Northeastern University
DS 4400 Machine Learning and Data Mining I Alina Oprea Associate Professor, CCIS Northeastern University September 20 2018 Review Solution for multiple linear regression can be computed in closed form
More informationBlocking Optimization Strategies for Sparse Tensor Computation
Blocking Optimization Strategies for Sparse Tensor Computation Jee Choi 1, Xing Liu 1, Shaden Smith 2, and Tyler Simon 3 1 IBM T. J. Watson Research, 2 University of Minnesota, 3 University of Maryland
More informationIntroduction to Graph Theory
Introduction to Graph Theory Tandy Warnow January 20, 2017 Graphs Tandy Warnow Graphs A graph G = (V, E) is an object that contains a vertex set V and an edge set E. We also write V (G) to denote the vertex
More informationDynamically Motivated Models for Multiplex Networks 1
Introduction Dynamically Motivated Models for Multiplex Networks 1 Daryl DeFord Dartmouth College Department of Mathematics Santa Fe institute Inference on Networks: Algorithms, Phase Transitions, New
More informationGraphs and Network Flows IE411. Lecture 21. Dr. Ted Ralphs
Graphs and Network Flows IE411 Lecture 21 Dr. Ted Ralphs IE411 Lecture 21 1 Combinatorial Optimization and Network Flows In general, most combinatorial optimization and integer programming problems are
More informationClustering. SC4/SM4 Data Mining and Machine Learning, Hilary Term 2017 Dino Sejdinovic
Clustering SC4/SM4 Data Mining and Machine Learning, Hilary Term 2017 Dino Sejdinovic Clustering is one of the fundamental and ubiquitous tasks in exploratory data analysis a first intuition about the
More informationExtracting Information from Complex Networks
Extracting Information from Complex Networks 1 Complex Networks Networks that arise from modeling complex systems: relationships Social networks Biological networks Distinguish from random networks uniform
More informationarxiv: v1 [cs.si] 6 Oct 2018
Higher-order Spectral Clustering for Heterogeneous Graphs Aldo G. Carranza Stanford University Anup Rao Adobe Research Ryan A. Rossi Adobe Research Eunyee Koh Adobe Research arxiv:80.02959v [cs.si] 6 Oct
More informationApplication of Spectral Clustering Algorithm
1/27 Application of Spectral Clustering Algorithm Danielle Middlebrooks dmiddle1@math.umd.edu Advisor: Kasso Okoudjou kasso@umd.edu Department of Mathematics University of Maryland- College Park Advance
More informationProblem Definition. Clustering nonlinearly separable data:
Outlines Weighted Graph Cuts without Eigenvectors: A Multilevel Approach (PAMI 2007) User-Guided Large Attributed Graph Clustering with Multiple Sparse Annotations (PAKDD 2016) Problem Definition Clustering
More informationVisual Tracking (1) Feature Point Tracking and Block Matching
Intelligent Control Systems Visual Tracking (1) Feature Point Tracking and Block Matching Shingo Kagami Graduate School of Information Sciences, Tohoku University swk(at)ic.is.tohoku.ac.jp http://www.ic.is.tohoku.ac.jp/ja/swk/
More informationLSRN: A Parallel Iterative Solver for Strongly Over- or Under-Determined Systems
LSRN: A Parallel Iterative Solver for Strongly Over- or Under-Determined Systems Xiangrui Meng Joint with Michael A. Saunders and Michael W. Mahoney Stanford University June 19, 2012 Meng, Saunders, Mahoney
More informationProblem 1: Complexity of Update Rules for Logistic Regression
Case Study 1: Estimating Click Probabilities Tackling an Unknown Number of Features with Sketching Machine Learning for Big Data CSE547/STAT548, University of Washington Emily Fox January 16 th, 2014 1
More informationCS473-Algorithms I. Lecture 13-A. Graphs. Cevdet Aykanat - Bilkent University Computer Engineering Department
CS473-Algorithms I Lecture 3-A Graphs Graphs A directed graph (or digraph) G is a pair (V, E), where V is a finite set, and E is a binary relation on V The set V: Vertex set of G The set E: Edge set of
More informationMini-project 2 CMPSCI 689 Spring 2015 Due: Tuesday, April 07, in class
Mini-project 2 CMPSCI 689 Spring 2015 Due: Tuesday, April 07, in class Guidelines Submission. Submit a hardcopy of the report containing all the figures and printouts of code in class. For readability
More informationMachine learning - HT Clustering
Machine learning - HT 2016 10. Clustering Varun Kanade University of Oxford March 4, 2016 Announcements Practical Next Week - No submission Final Exam: Pick up on Monday Material covered next week is not
More informationLecture 11: Clustering and the Spectral Partitioning Algorithm A note on randomized algorithm, Unbiased estimates
CSE 51: Design and Analysis of Algorithms I Spring 016 Lecture 11: Clustering and the Spectral Partitioning Algorithm Lecturer: Shayan Oveis Gharan May nd Scribe: Yueqi Sheng Disclaimer: These notes have
More informationMOST machine learning algorithms rely on the assumption
1 Domain Adaptation on Graphs by Learning Aligned Graph Bases Mehmet Pilancı and Elif Vural arxiv:183.5288v1 [stat.ml] 14 Mar 218 Abstract We propose a method for domain adaptation on graphs. Given sufficiently
More informationData Preprocessing. Javier Béjar AMLT /2017 CS - MAI. (CS - MAI) Data Preprocessing AMLT / / 71 BY: $\
Data Preprocessing S - MAI AMLT - 2016/2017 (S - MAI) Data Preprocessing AMLT - 2016/2017 1 / 71 Outline 1 Introduction Data Representation 2 Data Preprocessing Outliers Missing Values Normalization Discretization
More informationSparse and large-scale learning with heterogeneous data
Sparse and large-scale learning with heterogeneous data February 15, 2007 Gert Lanckriet (gert@ece.ucsd.edu) IEEE-SDCIS In this talk Statistical machine learning Techniques: roots in classical statistics
More informationTRANSDUCTIVE LINK SPAM DETECTION
TRANSDUCTIVE LINK SPAM DETECTION Denny Zhou Microsoft Research http://research.microsoft.com/~denzho Joint work with Chris Burges and Tao Tao Presenter: Krysta Svore Link spam detection problem Classification
More informationTopology-Invariant Similarity and Diffusion Geometry
1 Topology-Invariant Similarity and Diffusion Geometry Lecture 7 Alexander & Michael Bronstein tosca.cs.technion.ac.il/book Numerical geometry of non-rigid shapes Stanford University, Winter 2009 Intrinsic
More informationAn Approximate Singular Value Decomposition of Large Matrices in Julia
An Approximate Singular Value Decomposition of Large Matrices in Julia Alexander J. Turner 1, 1 Harvard University, School of Engineering and Applied Sciences, Cambridge, MA, USA. In this project, I implement
More informationBehavioral Data Mining. Lecture 10 Kernel methods and SVMs
Behavioral Data Mining Lecture 10 Kernel methods and SVMs Outline SVMs as large-margin linear classifiers Kernel methods SVM algorithms SVMs as large-margin classifiers margin The separating plane maximizes
More informationModelling and implementation of algorithms in applied mathematics using MPI
Modelling and implementation of algorithms in applied mathematics using MPI Lecture 1: Basics of Parallel Computing G. Rapin Brazil March 2011 Outline 1 Structure of Lecture 2 Introduction 3 Parallel Performance
More informationKernel spectral clustering: model representations, sparsity and out-of-sample extensions
Kernel spectral clustering: model representations, sparsity and out-of-sample extensions Johan Suykens and Carlos Alzate K.U. Leuven, ESAT-SCD/SISTA Kasteelpark Arenberg B-3 Leuven (Heverlee), Belgium
More informationGPUML: Graphical processors for speeding up kernel machines
GPUML: Graphical processors for speeding up kernel machines http://www.umiacs.umd.edu/~balajiv/gpuml.htm Balaji Vasan Srinivasan, Qi Hu, Ramani Duraiswami Department of Computer Science, University of
More informationHashing with Graphs. Sanjiv Kumar (Google), and Shih Fu Chang (Columbia) June, 2011
Hashing with Graphs Wei Liu (Columbia Columbia), Jun Wang (IBM IBM), Sanjiv Kumar (Google), and Shih Fu Chang (Columbia) June, 2011 Overview Graph Hashing Outline Anchor Graph Hashing Experiments Conclusions
More informationComplex-Network Modelling and Inference
Complex-Network Modelling and Inference Lecture 8: Graph features (2) Matthew Roughan http://www.maths.adelaide.edu.au/matthew.roughan/notes/ Network_Modelling/ School
More informationSpectral Methods for Network Community Detection and Graph Partitioning
Spectral Methods for Network Community Detection and Graph Partitioning M. E. J. Newman Department of Physics, University of Michigan Presenters: Yunqi Guo Xueyin Yu Yuanqi Li 1 Outline: Community Detection
More informationMSA220 - Statistical Learning for Big Data
MSA220 - Statistical Learning for Big Data Lecture 13 Rebecka Jörnsten Mathematical Sciences University of Gothenburg and Chalmers University of Technology Clustering Explorative analysis - finding groups
More informationGraphs. Pseudograph: multiple edges and loops allowed
Graphs G = (V, E) V - set of vertices, E - set of edges Undirected graphs Simple graph: V - nonempty set of vertices, E - set of unordered pairs of distinct vertices (no multiple edges or loops) Multigraph:
More informationSlides based on those in:
Spyros Kontogiannis & Christos Zaroliagis Slides based on those in: http://www.mmds.org A 3.3 B 38.4 C 34.3 D 3.9 E 8.1 F 3.9 1.6 1.6 1.6 1.6 1.6 2 y 0.8 ½+0.2 ⅓ M 1/2 1/2 0 0.8 1/2 0 0 + 0.2 0 1/2 1 [1/N]
More information3D Geometry and Camera Calibration
3D Geometr and Camera Calibration 3D Coordinate Sstems Right-handed vs. left-handed 2D Coordinate Sstems ais up vs. ais down Origin at center vs. corner Will often write (u, v) for image coordinates v
More informationBehavioral Data Mining. Lecture 18 Clustering
Behavioral Data Mining Lecture 18 Clustering Outline Why? Cluster quality K-means Spectral clustering Generative Models Rationale Given a set {X i } for i = 1,,n, a clustering is a partition of the X i
More informationScalable Clustering of Signed Networks Using Balance Normalized Cut
Scalable Clustering of Signed Networks Using Balance Normalized Cut Kai-Yang Chiang,, Inderjit S. Dhillon The 21st ACM International Conference on Information and Knowledge Management (CIKM 2012) Oct.
More informationThe Un-normalized Graph p-laplacian based Semi-supervised Learning Method and Speech Recognition Problem
Int. J. Advance Soft Compu. Appl, Vol. 9, No. 1, March 2017 ISSN 2074-8523 The Un-normalized Graph p-laplacian based Semi-supervised Learning Method and Speech Recognition Problem Loc Tran 1 and Linh Tran
More informationHigher-Order Clustering for Heterogeneous Networks via Typed Motifs
Aldo G. Carranza Stanford University Anup Rao Adobe Research Ryan A. Rossi Adobe Research Eunyee Koh Adobe Research ABSTRACT Higher-order connectivity patterns such as small induced subgraphs called graphlets
More informationV4 Matrix algorithms and graph partitioning
V4 Matrix algorithms and graph partitioning - Community detection - Simple modularity maximization - Spectral modularity maximization - Division into more than two groups - Other algorithms for community
More informationMore Data, Less Work: Runtime as a decreasing function of data set size. Nati Srebro. Toyota Technological Institute Chicago
More Data, Less Work: Runtime as a decreasing function of data set size Nati Srebro Toyota Technological Institute Chicago Outline we are here SVM speculations, other problems Clustering wild speculations,
More information