Deep Learning on Graphs


Deep Learning on Graphs with Graph Convolutional Networks. Thomas Kipf, 6 April 2017. Joint work with Max Welling (University of Amsterdam). (The title slide shows the GCN architecture schematic: input, hidden layers with ReLU activations, output.)

The success story of deep learning: speech data, natural language processing (NLP). These are deep neural nets that exploit:
- translation invariance (weight sharing)
- hierarchical compositionality

Recap: Deep learning on Euclidean data. Euclidean data: grids and sequences, e.g. a 2D grid or a 1D grid.

Recap: Deep learning on Euclidean data. Convolutional neural networks (CNNs) (animation by Vincent Dumoulin; source: Wikipedia). Recurrent neural networks (RNNs) (source: Christopher Olah's blog).

Traditional vs. deep learning. Traditional approach: a hand-designed feature extractor with a classifier on top produces the output. End-to-end learning: a deep neural network maps the input directly to the output.

CNNs: Message passing on a grid graph. Consider a single CNN layer with a 3x3 filter (animation by Vincent Dumoulin). Update for a single pixel: transform the messages from the pixel's neighborhood individually, then add everything up. Full update: $h_i^{(l+1)} = \sigma\big(\sum_{j \in \mathcal{N}(i) \cup \{i\}} W_j^{(l)} h_j^{(l)}\big)$, with a distinct weight matrix $W_j^{(l)}$ per filter position.
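To make "transform individually, then add up" concrete, here is a minimal NumPy sketch (names and dimensions are illustrative assumptions, not the talk's code): each of the nine 3x3 filter positions gets its own weight matrix, and messages are transformed per position before being summed.

```python
import numpy as np

C_in, C_out = 8, 16                        # illustrative channel sizes
rng = np.random.default_rng(0)
W = rng.normal(size=(9, C_in, C_out))      # one weight matrix per filter position

def update_pixel(patch):
    """patch: (9, C_in) features of one pixel plus its 8 grid neighbors."""
    messages = [patch[j] @ W[j] for j in range(9)]  # transform individually
    return np.maximum(sum(messages), 0.0)           # add everything up, ReLU

h_new = update_pixel(rng.normal(size=(9, C_in)))
print(h_new.shape)  # (16,)
```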

Graph-structured data. What if our data looks like this, or this? Real-world examples: social networks, the world wide web, protein-interaction networks, telecommunication networks, knowledge graphs.

Graph-structured data. A graph (nodes A to E) and its adjacency matrix:

    A  B  C  D  E
A   0  1  1  1  0
B   1  0  0  1  1
C   1  0  0  1  0
D   1  1  1  0  1
E   0  1  0  1  0

Model wish list:
- Trainable in $\mathcal{O}(|E|)$ time (linear in the number of edges)
- Applicable even if the input graph changes

A naïve approach. Take the adjacency matrix A and the feature matrix X, concatenate them as [A, X], and feed the result into a deep (fully connected) neural net. Done?

    A  B  C  D  E  Feat
A   0  1  1  1  0   ?
B   1  0  0  1  1   ?
C   1  0  0  1  0   ?
D   1  1  1  0  1   ?
E   0  1  0  1  0   ?

Problems:
- Huge number of parameters
- Needs re-training if the graph changes

We need weight sharing: CNNs on graphs, i.e. Graph Convolutional Networks (GCNs).
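A quick back-of-the-envelope sketch of the parameter problem (all sizes are hypothetical):

```python
# Hypothetical sizes: N nodes, F features per node, H hidden units.
N, F, H = 1000, 32, 64
input_dim = N * (N + F)          # flattened [A, X] input vector
print(input_dim * H)             # 66048000: ~66M weights in the first layer,
                                 # and the net is tied to this exact node count
```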

GCNs with first-order message passing (a related idea was first proposed in Scarselli et al., 2009). Consider an undirected graph and calculate the update for the node in red. Update rule:

$h_i^{(l+1)} = \sigma\Big(\sum_{j \in \mathcal{N}_i} \frac{1}{c_{ij}} h_j^{(l)} W^{(l)}\Big)$

where $\mathcal{N}_i$ is the set of neighbor indices of node $i$ (in practice including $i$ itself via an added self-loop) and $c_{ij}$ is a normalization constant (per edge). Note: we could also choose simpler or more general functions over the neighborhood.
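A minimal NumPy sketch of this propagation step, assuming the symmetric normalization $c_{ij} = \sqrt{d_i d_j}$ from Kipf & Welling (ICLR 2017); this is a toy reimplementation, not the reference code:

```python
import numpy as np

def gcn_layer(A, H, W):
    """One first-order message-passing step: ReLU(D^-1/2 (A+I) D^-1/2 H W)."""
    A_hat = A + np.eye(A.shape[0])          # add self-loops
    d_inv_sqrt = np.diag(A_hat.sum(axis=1) ** -0.5)
    A_norm = d_inv_sqrt @ A_hat @ d_inv_sqrt
    return np.maximum(A_norm @ H @ W, 0.0)  # aggregate, transform, ReLU

# The 5-node example graph (nodes A..E) from the earlier slide.
A = np.array([[0, 1, 1, 1, 0],
              [1, 0, 0, 1, 1],
              [1, 0, 0, 1, 0],
              [1, 1, 1, 0, 1],
              [0, 1, 0, 1, 0]], dtype=float)
rng = np.random.default_rng(0)
H = rng.normal(size=(5, 4))                 # 4 input features per node
W = rng.normal(size=(4, 2))                 # project to 2 hidden units
print(gcn_layer(A, H, W).shape)             # (5, 2)
```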

GCN model architecture. Input: feature matrix $X$ and preprocessed adjacency matrix $\hat{A}$. The network stacks hidden layers with ReLU activations between input and output. [Kipf & Welling, ICLR 2017]
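For a two-layer model, the forward pass from the paper is $Z = \mathrm{softmax}(\hat{A}\,\mathrm{ReLU}(\hat{A} X W^{(0)})\, W^{(1)})$; a compact sketch (weights and sizes are placeholders):

```python
import numpy as np

def normalize(A):
    """Preprocess the adjacency: self-loops plus symmetric normalization."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = np.diag(A_hat.sum(axis=1) ** -0.5)
    return d_inv_sqrt @ A_hat @ d_inv_sqrt

def gcn_forward(A, X, W0, W1):
    A_norm = normalize(A)
    H = np.maximum(A_norm @ X @ W0, 0.0)         # hidden layer, ReLU
    logits = A_norm @ H @ W1                     # output layer
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)      # row-wise softmax

rng = np.random.default_rng(1)
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)  # tiny path graph
X = rng.normal(size=(3, 4))
Z = gcn_forward(A, X, rng.normal(size=(4, 8)), rng.normal(size=(8, 2)))
print(Z.sum(axis=1))  # each row sums to 1: a class distribution per node
```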

What does it do? An example: a forward pass through an untrained 3-layer GCN model, with parameters initialized randomly and a 2-dimensional output per node, applied to Zachary's karate club network.
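A sketch of this experiment (assuming networkx for the karate club graph; the identity-feature input and tanh activations are assumptions chosen for illustration):

```python
import networkx as nx
import numpy as np

G = nx.karate_club_graph()
A = nx.to_numpy_array(G)
N = A.shape[0]

A_hat = A + np.eye(N)                          # self-loops
d_inv_sqrt = np.diag(A_hat.sum(axis=1) ** -0.5)
A_norm = d_inv_sqrt @ A_hat @ d_inv_sqrt       # symmetric normalization

H = np.eye(N)                                  # featureless graph: identity input
rng = np.random.default_rng(42)
for dim_out in (16, 16, 2):                    # 3 layers, 2-dim output per node
    W = rng.normal(size=(H.shape[1], dim_out)) # randomly initialized, untrained
    H = np.tanh(A_norm @ H @ W)

print(H.shape)                                 # (34, 2): a 2-d point per node
```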

Relation to the Weisfeiler-Lehman algorithm. This is a classical approach for node feature assignment: repeatedly replace each node's feature with a hash of the aggregated features of its neighbors. It is useful as a graph isomorphism check for most graphs (exception: highly regular graphs). The GCN layer can be read as a differentiable, parameterized variant of the same neighborhood update, $h_i^{(l+1)} = \sigma\big(\sum_{j} \frac{1}{c_{ij}} h_j^{(l)} W^{(l)}\big)$.
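A minimal sketch of one WL refinement pass (a standard formulation, not taken from the slides):

```python
def wl_iteration(labels, adj):
    """labels: dict node -> label; adj: dict node -> list of neighbor nodes."""
    return {
        v: hash((labels[v], tuple(sorted(labels[u] for u in adj[v]))))
        for v in adj
    }

adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}   # toy graph
labels = {v: 1 for v in adj}                          # uniform initial labels
for _ in range(3):
    labels = wl_iteration(labels, adj)
print(labels)  # nodes 0 and 1 stay indistinguishable; 2 and 3 separate out
```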

Semi-supervised classification on graphs. Setting: some nodes are labeled (black circles), all other nodes are unlabeled. Task: predict the node labels of the unlabeled nodes.

Standard approach: graph-based regularization [Zhu et al., 2003], $\mathcal{L} = \mathcal{L}_0 + \lambda \mathcal{L}_{\mathrm{reg}}$ with $\mathcal{L}_{\mathrm{reg}} = \sum_{i,j} A_{ij} \, \| f(X_i) - f(X_j) \|^2$, which assumes that connected nodes are likely to share the same label.
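A direct sketch of this regularizer (array shapes are illustrative):

```python
import numpy as np

def graph_reg(A, F_out):
    """sum_ij A_ij * ||f_i - f_j||^2 over all ordered node pairs."""
    diff = F_out[:, None, :] - F_out[None, :, :]   # (N, N, d) pairwise diffs
    return np.sum(A * np.sum(diff ** 2, axis=-1))

A = np.array([[0, 1], [1, 0]], dtype=float)        # one undirected edge
F_out = np.array([[0.0, 1.0], [1.0, 0.0]])         # very different predictions
print(graph_reg(A, F_out))  # 4.0: connected nodes that disagree are penalized
```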

Semi-supervised classification on graphs. Embedding-based approaches use a two-step pipeline: 1) get an embedding for every node; 2) train a classifier on the node embeddings. Examples: DeepWalk [Perozzi et al., 2014], node2vec [Grover & Leskovec, 2016]. Problem: the embeddings are not optimized for classification!

Idea: train a graph-based classifier end-to-end using a GCN, and evaluate the loss on labeled nodes only:

$\mathcal{L} = -\sum_{l \in \mathcal{Y}_L} \sum_{f} Y_{lf} \ln Z_{lf}$

where $\mathcal{Y}_L$ is the set of labeled node indices, $Y$ is the label matrix, and $Z$ is the GCN output (after softmax).
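This masked cross-entropy is straightforward to implement; a small sketch (array shapes are illustrative):

```python
import numpy as np

def masked_cross_entropy(Z, Y, labeled_idx):
    """Loss on labeled nodes only: -sum_{l in Y_L} sum_f Y_lf * ln Z_lf."""
    Z_l, Y_l = Z[labeled_idx], Y[labeled_idx]
    return -np.sum(Y_l * np.log(Z_l + 1e-12)) / len(labeled_idx)

Z = np.array([[0.7, 0.3], [0.2, 0.8], [0.5, 0.5]])  # softmax GCN outputs
Y = np.array([[1, 0], [0, 1], [0, 0]])              # node 2 is unlabeled
print(masked_cross_entropy(Z, Y, labeled_idx=[0, 1]))
```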

Toy example (semi-supervised learning). Video also available here: http://tkipf.github.io/graph-convolutional-networks

Application: Classification on citation networks. Input: citation networks (nodes are papers, edges are citation links, optionally bag-of-words features on nodes). Target: paper category (e.g. stat.ML, cs.LG, ...). (Figure from: Bronstein, Bruna, LeCun, Szlam, Vandergheynst, 2016)

Experiments and results. Model: 2-layer GCN. The slide shows dataset statistics, classification results (accuracy, including a variant with no input features), and learned representations (a t-SNE embedding of the hidden-layer activations). For reference, the paper reports accuracies of 81.5% on Cora, 70.3% on Citeseer, and 79.0% on Pubmed. (Kipf & Welling, Semi-Supervised Classification with Graph Convolutional Networks, ICLR 2017)

Other recent applications: molecules [Duvenaud et al., NIPS 2015], shapes [Monti et al., 2016], knowledge graphs [Schlichtkrull et al., 2017].

Link prediction with Graph Auto-Encoders (GAE). Kipf & Welling, NIPS Bayesian Deep Learning Workshop, 2016. Encoder: a GCN (hidden layers with ReLU) maps the input $X$ and $A$ to latent node embeddings $Z$, i.e. $q(Z \mid A, X)$. Decoder: reconstructs the adjacency matrix from $Z$, i.e. $p(A \mid Z)$, producing $\hat{A}$.
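The companion paper (Variational Graph Auto-Encoders) uses a simple inner-product decoder, $p(A_{ij} = 1) = \sigma(z_i^\top z_j)$; a sketch:

```python
import numpy as np

def decode(Z):
    """Inner-product decoder: edge probability sigmoid(z_i . z_j)."""
    logits = Z @ Z.T
    return 1.0 / (1.0 + np.exp(-logits))

Z = np.random.default_rng(0).normal(size=(5, 2))  # latent 2-d node embeddings
A_hat = decode(Z)
print(A_hat.shape)  # (5, 5): reconstructed adjacency probabilities
```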

Further reading:
- Blog post on Graph Convolutional Networks: http://tkipf.github.io/graph-convolutional-networks
- Code on GitHub: http://github.com/tkipf/gcn
- Kipf & Welling, Semi-Supervised Classification with Graph Convolutional Networks, ICLR 2017: https://arxiv.org/abs/1609.02907
- Kipf & Welling, Variational Graph Auto-Encoders, NIPS BDL Workshop, 2016: https://arxiv.org/abs/1611.07308

You can get in touch with me via:
- E-mail: T.N.Kipf@uva.nl
- Twitter: @thomaskipf
- Web: http://tkipf.github.io

Project funded by SAP.