Bipartite Edge Prediction via Transductive Learning over Product Graphs
Hanxiao Liu, Yiming Yang
School of Computer Science, Carnegie Mellon University
ICML 2015, July 8, 2015
Outline
1 Problem Description
2 The Proposed Framework
3 Formulation
  Product Graph Construction
  Graph-based Transductive Learning
4 Optimization
5 Experiment
6 Conclusion
Problem Description
Many applications involve predicting the edges of a bipartite graph between two vertex sets, e.g. {I, II} and {A, B, C}:
1 Recommender systems
2 Host-pathogen interaction
3 Question-answering mapping
4 Citation networks ...
Sometimes the vertex sets on both sides are intrinsically structured, with their internal similarities given by a graph G on one side and a graph H on the other. This yields heterogeneous information: G + H + partial observations. Combining them leads to better edge predictions.
The Proposed Framework
[Figure: bipartite graph between {I, II} and {A, B, C} with side graphs G and H; two labeled edges carry the values −2 and +5.]
Transductive learning should be effective:
1 Labeled edges (red) are highly sparse
2 Unlabeled edges (gray) are massively available
Assumption: similar edges should have similar labels.
Prerequisite: a similarity measure among the edges, i.e. a Graph of Edges (not directly provided). It can be induced from G and H via graph product!
The Graph of Edges can be induced by taking the product of G and H. In the product graph G × H:
- each vertex corresponds to an edge in the original bipartite graph;
- each edge encodes an edge-edge similarity.
The adjacency matrix of the product graph is defined by the graph product (to be discussed later).
Problem Mapping
Edge Prediction (original problem): given G, H and labeled edges, predict the unlabeled edges.
Vertex Prediction (equivalent problem): given G × H and labeled vertices, predict the unlabeled vertices.
The bipartite edges (I, A), (I, B), (I, C), (II, A), (II, B), (II, C) become the vertices of the product graph; the two observed labels (−2 and +5) carry over to the corresponding product-graph vertices.
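The problem mapping above can be sketched in a few lines. This is a minimal illustration with hypothetical label assignments (the slide figure shows the values −2 and +5; which exact edges carry them is chosen here for illustration):

```python
# Problem mapping: each edge of the bipartite graph becomes a vertex of the
# product graph, so edge labels become vertex labels and edge prediction
# becomes vertex prediction.

g_vertices = ["I", "II"]       # vertex set of graph G
h_vertices = ["A", "B", "C"]   # vertex set of graph H

# Partially observed edge labels (illustrative assignment).
edge_labels = {("I", "A"): -2, ("II", "B"): +5}

# Vertices of the product graph: all (i, j) pairs, i.e. all possible edges.
product_vertices = [(i, j) for i in g_vertices for j in h_vertices]

# Labeled vertices inherit the edge labels; the rest are to be predicted.
vertex_labels = {v: edge_labels.get(v) for v in product_vertices}
print(vertex_labels[("I", "A")])   # -2
print(vertex_labels[("I", "B")])   # None (unlabeled, to be predicted)
```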
Product Graph Construction
Q: When should vertex (i, j) ∼ (i′, j′) hold in the product graph?
Tensor GP: i ∼ i′ in G AND j ∼ j′ in H
Cartesian GP: (i ∼ i′ in G AND j = j′) OR (i = i′ AND j ∼ j′ in H)
Both can be trivially generalized to weighted graphs.
To compute the adjacency matrices of the PGs:
G ×_Tensor H = G ⊗ H, the Kronecker (a.k.a. tensor) product
G ×_Cartesian H = G ⊗ I + I ⊗ H = G ⊕ H, the Kronecker sum
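The two constructions above can be sketched with toy adjacency matrices (the graphs G and H here are illustrative, not from the paper):

```python
import numpy as np

# Toy adjacency matrices: G is a path on 2 vertices, H a path on 3 vertices.
G = np.array([[0., 1.],
              [1., 0.]])
H = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
m, n = G.shape[0], H.shape[0]

tensor_gp = np.kron(G, H)                                     # Kronecker (tensor) product
cartesian_gp = np.kron(G, np.eye(n)) + np.kron(np.eye(m), H)  # Kronecker sum

# Product-graph vertex (i, j) maps to row/column i * n + j.
# Tensor GP: (0,0) ~ (1,1) because 0 ~ 1 in G AND 0 ~ 1 in H.
print(tensor_gp[0 * n + 0, 1 * n + 1])      # 1.0
# Cartesian GP: (0,0) ~ (1,0) because 0 ~ 1 in G and the H-vertex is unchanged.
print(cartesian_gp[0 * n + 0, 1 * n + 0])   # 1.0
```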
Both GPs can be written in the form of a spectral decomposition:
G ×_Tensor H = Σ_{i,j} (λ_i · μ_j) (u_i ⊗ v_j)(u_i ⊗ v_j)ᵀ    (1)   ["soft AND"]
G ×_Cartesian H = Σ_{i,j} (λ_i + μ_j) (u_i ⊗ v_j)(u_i ⊗ v_j)ᵀ    (2)   ["soft OR"]
The interplay of the graphs is captured by the interplay of their spectra!
Generalization, the Spectral Graph Product:
G ⊛ H := Σ_{i,j} (λ_i ⊛ μ_j) (u_i ⊗ v_j)(u_i ⊗ v_j)ᵀ    (3)
where ⊛ can be an arbitrary binary operator (·, +, ...).
Commutative property: G ⊛ H and H ⊛ G are isomorphic.
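The spectral view can be checked numerically on small symmetric matrices: with ⊛ as multiplication the spectral construction recovers the Kronecker product, and with ⊛ as addition it recovers the Kronecker sum. A sketch with random toy data:

```python
import numpy as np

rng = np.random.default_rng(0)
G = rng.random((3, 3)); G = (G + G.T) / 2   # symmetric toy "adjacency"
H = rng.random((4, 4)); H = (H + H.T) / 2

lam, U = np.linalg.eigh(G)   # G = U diag(lam) U^T
mu, V = np.linalg.eigh(H)    # H = V diag(mu) V^T

def spectral_gp(op):
    """Sum over i, j of op(lam_i, mu_j) (u_i kron v_j)(u_i kron v_j)^T."""
    size = G.shape[0] * H.shape[0]
    A = np.zeros((size, size))
    for i in range(len(lam)):
        for j in range(len(mu)):
            w = np.kron(U[:, i], V[:, j])
            A += op(lam[i], mu[j]) * np.outer(w, w)
    return A

tensor = spectral_gp(lambda a, b: a * b)
cartesian = spectral_gp(lambda a, b: a + b)

print(np.allclose(tensor, np.kron(G, H)))                          # True
print(np.allclose(cartesian,
                  np.kron(G, np.eye(4)) + np.kron(np.eye(3), H)))  # True
```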
Graph-based Transductive Learning
With the product graph A := G ⊛ H constructed, we solve a standard graph-based transductive learning problem over A.
Learning objective:
min_f  l(f) + λ fᵀ A⁻¹ f    (4)
where l(f) is the loss function and λ fᵀ A⁻¹ f the graph regularization:
- f_i is the system-predicted value for vertex i in A;
- l(f) quantifies the gap between f and the partially observed labels;
- λ fᵀ A⁻¹ f quantifies the smoothness over the graph.
Underlying assumption: f ∼ N(0, A).
The enhanced learning objective
min_f  l(f) + λ fᵀ κ(A)⁻¹ f    (5)
incorporates a variety of graph transduction patterns:
- k-step random walk: κ(A) = A^k
- Regularized Laplacian: κ(A) = (εI − A)⁻¹ = Σ_{k≥0} ε^{−(k+1)} A^k
- Diffusion process: κ(A) = exp(A) = I + A + (1/2!) A² + (1/3!) A³ + ...
All of these can be viewed as transforming the spectrum of A := Σ_i θ_i u_i u_iᵀ:
A^k = Σ_i θ_i^k u_i u_iᵀ
(εI − A)⁻¹ = Σ_i (ε − θ_i)⁻¹ u_i u_iᵀ
exp(A) = Σ_i e^{θ_i} u_i u_iᵀ
Optimization
Transductive learning over the product graph:
min_f  l(f) + λ fᵀ κ(A)⁻¹ f,  with r(f) := λ fᵀ κ(A)⁻¹ f    (6)
Challenge: κ(A) = κ(G ⊛ H), with G of size m × m and H of size n × n, is a huge mn × mn matrix!
- Prohibitive to load into memory → no need to store κ(A)
- Prohibitive to compute its inverse → no need of a matrix inverse
- Even if κ(A)⁻¹ were given, computing r(f) naively is expensive → it can be performed much more efficiently
Keys for complexity reduction
1 Instead of matrices: κ only manipulates eigenvalues, and ⊛ only manipulates the interplay of eigenvalues.
2 The vec trick. The bottleneck is the multiplication (X ⊗ Y) f, where f = vec(F) and F_ij := system-predicted score for edge (i, j). By the Kronecker identity,
(X ⊗ Y) f = (X ⊗ Y) vec(F) = vec(Y F Xᵀ)    (7)
The left-hand side costs O(m²n²) time/space; the right-hand side O(mn(m + n)) time and O((m + n)²) space.
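The vec trick can be sketched on toy matrices; note that with the column-stacking convention for vec, F is taken as n × m here so the dimensions match:

```python
import numpy as np

# Multiplying by the Kronecker product X kron Y never needs the mn x mn
# matrix: (X kron Y) vec(F) = vec(Y F X^T) for column-stacking vec.
rng = np.random.default_rng(2)
m, n = 4, 3
X = rng.random((m, m))
Y = rng.random((n, n))
F = rng.random((n, m))   # n x m so that vec(F) has length mn

naive = np.kron(X, Y) @ F.flatten(order="F")   # O(m^2 n^2): forms the big matrix
trick = (Y @ F @ X.T).flatten(order="F")       # O(mn(m + n)): never forms it
print(np.allclose(naive, trick))               # True
```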
Optimization with Low-rank Constraint
Further speedup is possible by factorizing F into two low-rank matrices. The cost of each alternating gradient step is proportional to rank(F) · rank(Σ), where Σ is a characteristic matrix with Σ_ij = 1 / κ(λ_i ⊛ μ_j).
An interesting observation: rank(Σ) is usually a small constant!
Example: the diffusion process over the Cartesian PG gives
Σ = [ e^{−(λ_i + μ_j)} ]_{i,j} = ( e^{−λ_1}, ..., e^{−λ_m} )ᵀ ( e^{−μ_1}, ..., e^{−μ_n} )  ⇒  rank(Σ) = 1
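The rank-1 observation for this example is easy to check numerically; the spectra below are illustrative toy values:

```python
import numpy as np

# Diffusion over the Cartesian PG: Sigma_ij = 1/kappa(lam_i + mu_j)
#                                           = exp(-(lam_i + mu_j)),
# which factorizes as an outer product, hence rank(Sigma) = 1.
lam = np.array([0.3, 1.2, 2.0, 2.7])   # toy spectrum of G
mu = np.array([0.5, 1.5, 2.5])         # toy spectrum of H

Sigma = np.exp(-(lam[:, None] + mu[None, :]))   # Sigma_ij = e^{-(lam_i + mu_j)}
print(np.linalg.matrix_rank(Sigma))             # 1

# Explicit rank-1 factorization: Sigma = e^{-lam} (e^{-mu})^T
print(np.allclose(Sigma, np.outer(np.exp(-lam), np.exp(-mu))))   # True
```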
Datasets and Baselines
Datasets (G / H):
  MovieLens-100K: users / movies
  Cora: publications / publications
  Courses: courses / prerequisite courses
Baselines:
  MC: Matrix Completion; ignores the info of G and H.
  TK: Tensor Kernel; implicitly constructs the PG, no transduction.
  GRMC: Graph-Regularized Matrix Completion; transduction over G and H, no PG constructed.
Results
Performance of several interesting combinations of κ (graph transduction) and ⊛ (graph product), measured in MAP and AUC on each dataset (Courses, Cora, MovieLens):
  Random Walk + Tensor GP
  Diffusion + Cartesian GP
  von Neumann + Tensor GP
  von Neumann + Cartesian GP
  Sigmoid + Cartesian GP
[table values not preserved in the transcription]
Results
Proposed method (Diffusion + Cartesian GP) vs. baselines (MC, GRMC, TK), measured in MAP, AUC and nDCG@3 on each dataset (Courses, Cora, MovieLens). [table values not preserved in the transcription]
Conclusion
Summary
  Problem: predicting the missing edges of a bipartite graph with graph-structured vertex sets on both sides.
  Contribution: a novel approach via transductive learning over a product graph, an efficient algorithmic solution, and good empirical results.
On-going work
  Extend to k graphs (k > 2): bipartite graph → k-partite graph, edge → hyperedge.
  Determine the optimal graph product for any given problem.
Thanks!
More informationDiffusion Wavelets for Natural Image Analysis
Diffusion Wavelets for Natural Image Analysis Tyrus Berry December 16, 2011 Contents 1 Project Description 2 2 Introduction to Diffusion Wavelets 2 2.1 Diffusion Multiresolution............................
More informationSampling Large Graphs for Anticipatory Analysis
Sampling Large Graphs for Anticipatory Analysis Lauren Edwards*, Luke Johnson, Maja Milosavljevic, Vijay Gadepally, Benjamin A. Miller IEEE High Performance Extreme Computing Conference September 16, 2015
More informationAn Introduction to Graph Theory
An Introduction to Graph Theory CIS008-2 Logic and Foundations of Mathematics David Goodwin david.goodwin@perisic.com 12:00, Friday 17 th February 2012 Outline 1 Graphs 2 Paths and cycles 3 Graphs and
More informationKernels + K-Means Introduction to Machine Learning. Matt Gormley Lecture 29 April 25, 2018
10-601 Introduction to Machine Learning Machine Learning Department School of Computer Science Carnegie Mellon University Kernels + K-Means Matt Gormley Lecture 29 April 25, 2018 1 Reminders Homework 8:
More informationby conservation of flow, hence the cancelation. Similarly, we have
Chapter 13: Network Flows and Applications Network: directed graph with source S and target T. Non-negative edge weights represent capacities. Assume no edges into S or out of T. (If necessary, we can
More informationNetwork Traffic Measurements and Analysis
DEIB - Politecnico di Milano Fall, 2017 Sources Hastie, Tibshirani, Friedman: The Elements of Statistical Learning James, Witten, Hastie, Tibshirani: An Introduction to Statistical Learning Andrew Ng:
More informationSocial Network Analysis
Social Network Analysis Mathematics of Networks Manar Mohaisen Department of EEC Engineering Adjacency matrix Network types Edge list Adjacency list Graph representation 2 Adjacency matrix Adjacency matrix
More informationUsing PageRank in Feature Selection
Using PageRank in Feature Selection Dino Ienco, Rosa Meo, and Marco Botta Dipartimento di Informatica, Università di Torino, Italy {ienco,meo,botta}@di.unito.it Abstract. Feature selection is an important
More informationDS Machine Learning and Data Mining I. Alina Oprea Associate Professor, CCIS Northeastern University
DS 4400 Machine Learning and Data Mining I Alina Oprea Associate Professor, CCIS Northeastern University September 20 2018 Review Solution for multiple linear regression can be computed in closed form
More informationBlocking Optimization Strategies for Sparse Tensor Computation
Blocking Optimization Strategies for Sparse Tensor Computation Jee Choi 1, Xing Liu 1, Shaden Smith 2, and Tyler Simon 3 1 IBM T. J. Watson Research, 2 University of Minnesota, 3 University of Maryland
More informationIntroduction to Graph Theory
Introduction to Graph Theory Tandy Warnow January 20, 2017 Graphs Tandy Warnow Graphs A graph G = (V, E) is an object that contains a vertex set V and an edge set E. We also write V (G) to denote the vertex
More informationDynamically Motivated Models for Multiplex Networks 1
Introduction Dynamically Motivated Models for Multiplex Networks 1 Daryl DeFord Dartmouth College Department of Mathematics Santa Fe institute Inference on Networks: Algorithms, Phase Transitions, New
More informationGraphs and Network Flows IE411. Lecture 21. Dr. Ted Ralphs
Graphs and Network Flows IE411 Lecture 21 Dr. Ted Ralphs IE411 Lecture 21 1 Combinatorial Optimization and Network Flows In general, most combinatorial optimization and integer programming problems are
More informationClustering. SC4/SM4 Data Mining and Machine Learning, Hilary Term 2017 Dino Sejdinovic
Clustering SC4/SM4 Data Mining and Machine Learning, Hilary Term 2017 Dino Sejdinovic Clustering is one of the fundamental and ubiquitous tasks in exploratory data analysis a first intuition about the
More informationExtracting Information from Complex Networks
Extracting Information from Complex Networks 1 Complex Networks Networks that arise from modeling complex systems: relationships Social networks Biological networks Distinguish from random networks uniform
More informationarxiv: v1 [cs.si] 6 Oct 2018
Higher-order Spectral Clustering for Heterogeneous Graphs Aldo G. Carranza Stanford University Anup Rao Adobe Research Ryan A. Rossi Adobe Research Eunyee Koh Adobe Research arxiv:80.02959v [cs.si] 6 Oct
More informationApplication of Spectral Clustering Algorithm
1/27 Application of Spectral Clustering Algorithm Danielle Middlebrooks dmiddle1@math.umd.edu Advisor: Kasso Okoudjou kasso@umd.edu Department of Mathematics University of Maryland- College Park Advance
More informationProblem Definition. Clustering nonlinearly separable data:
Outlines Weighted Graph Cuts without Eigenvectors: A Multilevel Approach (PAMI 2007) User-Guided Large Attributed Graph Clustering with Multiple Sparse Annotations (PAKDD 2016) Problem Definition Clustering
More informationVisual Tracking (1) Feature Point Tracking and Block Matching
Intelligent Control Systems Visual Tracking (1) Feature Point Tracking and Block Matching Shingo Kagami Graduate School of Information Sciences, Tohoku University swk(at)ic.is.tohoku.ac.jp http://www.ic.is.tohoku.ac.jp/ja/swk/
More informationLSRN: A Parallel Iterative Solver for Strongly Over- or Under-Determined Systems
LSRN: A Parallel Iterative Solver for Strongly Over- or Under-Determined Systems Xiangrui Meng Joint with Michael A. Saunders and Michael W. Mahoney Stanford University June 19, 2012 Meng, Saunders, Mahoney
More informationProblem 1: Complexity of Update Rules for Logistic Regression
Case Study 1: Estimating Click Probabilities Tackling an Unknown Number of Features with Sketching Machine Learning for Big Data CSE547/STAT548, University of Washington Emily Fox January 16 th, 2014 1
More informationCS473-Algorithms I. Lecture 13-A. Graphs. Cevdet Aykanat - Bilkent University Computer Engineering Department
CS473-Algorithms I Lecture 3-A Graphs Graphs A directed graph (or digraph) G is a pair (V, E), where V is a finite set, and E is a binary relation on V The set V: Vertex set of G The set E: Edge set of
More informationMini-project 2 CMPSCI 689 Spring 2015 Due: Tuesday, April 07, in class
Mini-project 2 CMPSCI 689 Spring 2015 Due: Tuesday, April 07, in class Guidelines Submission. Submit a hardcopy of the report containing all the figures and printouts of code in class. For readability
More informationMachine learning - HT Clustering
Machine learning - HT 2016 10. Clustering Varun Kanade University of Oxford March 4, 2016 Announcements Practical Next Week - No submission Final Exam: Pick up on Monday Material covered next week is not
More informationLecture 11: Clustering and the Spectral Partitioning Algorithm A note on randomized algorithm, Unbiased estimates
CSE 51: Design and Analysis of Algorithms I Spring 016 Lecture 11: Clustering and the Spectral Partitioning Algorithm Lecturer: Shayan Oveis Gharan May nd Scribe: Yueqi Sheng Disclaimer: These notes have
More informationMOST machine learning algorithms rely on the assumption
1 Domain Adaptation on Graphs by Learning Aligned Graph Bases Mehmet Pilancı and Elif Vural arxiv:183.5288v1 [stat.ml] 14 Mar 218 Abstract We propose a method for domain adaptation on graphs. Given sufficiently
More informationData Preprocessing. Javier Béjar AMLT /2017 CS - MAI. (CS - MAI) Data Preprocessing AMLT / / 71 BY: $\
Data Preprocessing S - MAI AMLT - 2016/2017 (S - MAI) Data Preprocessing AMLT - 2016/2017 1 / 71 Outline 1 Introduction Data Representation 2 Data Preprocessing Outliers Missing Values Normalization Discretization
More informationSparse and large-scale learning with heterogeneous data
Sparse and large-scale learning with heterogeneous data February 15, 2007 Gert Lanckriet (gert@ece.ucsd.edu) IEEE-SDCIS In this talk Statistical machine learning Techniques: roots in classical statistics
More informationTRANSDUCTIVE LINK SPAM DETECTION
TRANSDUCTIVE LINK SPAM DETECTION Denny Zhou Microsoft Research http://research.microsoft.com/~denzho Joint work with Chris Burges and Tao Tao Presenter: Krysta Svore Link spam detection problem Classification
More informationTopology-Invariant Similarity and Diffusion Geometry
1 Topology-Invariant Similarity and Diffusion Geometry Lecture 7 Alexander & Michael Bronstein tosca.cs.technion.ac.il/book Numerical geometry of non-rigid shapes Stanford University, Winter 2009 Intrinsic
More informationAn Approximate Singular Value Decomposition of Large Matrices in Julia
An Approximate Singular Value Decomposition of Large Matrices in Julia Alexander J. Turner 1, 1 Harvard University, School of Engineering and Applied Sciences, Cambridge, MA, USA. In this project, I implement
More informationBehavioral Data Mining. Lecture 10 Kernel methods and SVMs
Behavioral Data Mining Lecture 10 Kernel methods and SVMs Outline SVMs as large-margin linear classifiers Kernel methods SVM algorithms SVMs as large-margin classifiers margin The separating plane maximizes
More informationModelling and implementation of algorithms in applied mathematics using MPI
Modelling and implementation of algorithms in applied mathematics using MPI Lecture 1: Basics of Parallel Computing G. Rapin Brazil March 2011 Outline 1 Structure of Lecture 2 Introduction 3 Parallel Performance
More informationKernel spectral clustering: model representations, sparsity and out-of-sample extensions
Kernel spectral clustering: model representations, sparsity and out-of-sample extensions Johan Suykens and Carlos Alzate K.U. Leuven, ESAT-SCD/SISTA Kasteelpark Arenberg B-3 Leuven (Heverlee), Belgium
More informationGPUML: Graphical processors for speeding up kernel machines
GPUML: Graphical processors for speeding up kernel machines http://www.umiacs.umd.edu/~balajiv/gpuml.htm Balaji Vasan Srinivasan, Qi Hu, Ramani Duraiswami Department of Computer Science, University of
More informationHashing with Graphs. Sanjiv Kumar (Google), and Shih Fu Chang (Columbia) June, 2011
Hashing with Graphs Wei Liu (Columbia Columbia), Jun Wang (IBM IBM), Sanjiv Kumar (Google), and Shih Fu Chang (Columbia) June, 2011 Overview Graph Hashing Outline Anchor Graph Hashing Experiments Conclusions
More informationComplex-Network Modelling and Inference
Complex-Network Modelling and Inference Lecture 8: Graph features (2) Matthew Roughan http://www.maths.adelaide.edu.au/matthew.roughan/notes/ Network_Modelling/ School
More informationSpectral Methods for Network Community Detection and Graph Partitioning
Spectral Methods for Network Community Detection and Graph Partitioning M. E. J. Newman Department of Physics, University of Michigan Presenters: Yunqi Guo Xueyin Yu Yuanqi Li 1 Outline: Community Detection
More informationMSA220 - Statistical Learning for Big Data
MSA220 - Statistical Learning for Big Data Lecture 13 Rebecka Jörnsten Mathematical Sciences University of Gothenburg and Chalmers University of Technology Clustering Explorative analysis - finding groups
More informationGraphs. Pseudograph: multiple edges and loops allowed
Graphs G = (V, E) V - set of vertices, E - set of edges Undirected graphs Simple graph: V - nonempty set of vertices, E - set of unordered pairs of distinct vertices (no multiple edges or loops) Multigraph:
More informationSlides based on those in:
Spyros Kontogiannis & Christos Zaroliagis Slides based on those in: http://www.mmds.org A 3.3 B 38.4 C 34.3 D 3.9 E 8.1 F 3.9 1.6 1.6 1.6 1.6 1.6 2 y 0.8 ½+0.2 ⅓ M 1/2 1/2 0 0.8 1/2 0 0 + 0.2 0 1/2 1 [1/N]
More information3D Geometry and Camera Calibration
3D Geometr and Camera Calibration 3D Coordinate Sstems Right-handed vs. left-handed 2D Coordinate Sstems ais up vs. ais down Origin at center vs. corner Will often write (u, v) for image coordinates v
More informationBehavioral Data Mining. Lecture 18 Clustering
Behavioral Data Mining Lecture 18 Clustering Outline Why? Cluster quality K-means Spectral clustering Generative Models Rationale Given a set {X i } for i = 1,,n, a clustering is a partition of the X i
More informationScalable Clustering of Signed Networks Using Balance Normalized Cut
Scalable Clustering of Signed Networks Using Balance Normalized Cut Kai-Yang Chiang,, Inderjit S. Dhillon The 21st ACM International Conference on Information and Knowledge Management (CIKM 2012) Oct.
More informationThe Un-normalized Graph p-laplacian based Semi-supervised Learning Method and Speech Recognition Problem
Int. J. Advance Soft Compu. Appl, Vol. 9, No. 1, March 2017 ISSN 2074-8523 The Un-normalized Graph p-laplacian based Semi-supervised Learning Method and Speech Recognition Problem Loc Tran 1 and Linh Tran
More informationHigher-Order Clustering for Heterogeneous Networks via Typed Motifs
Aldo G. Carranza Stanford University Anup Rao Adobe Research Ryan A. Rossi Adobe Research Eunyee Koh Adobe Research ABSTRACT Higher-order connectivity patterns such as small induced subgraphs called graphlets
More informationV4 Matrix algorithms and graph partitioning
V4 Matrix algorithms and graph partitioning - Community detection - Simple modularity maximization - Spectral modularity maximization - Division into more than two groups - Other algorithms for community
More informationMore Data, Less Work: Runtime as a decreasing function of data set size. Nati Srebro. Toyota Technological Institute Chicago
More Data, Less Work: Runtime as a decreasing function of data set size Nati Srebro Toyota Technological Institute Chicago Outline we are here SVM speculations, other problems Clustering wild speculations,
More information