arxiv: v1 [cond-mat.dis-nn] 30 Dec 2018


A General Deep Learning Framework for Structure and Dynamics Reconstruction from Time Series Data

arXiv:1812.11482v1 [cond-mat.dis-nn] 30 Dec 2018

Zhang Zhang, Jing Liu, Shuo Wang, Ruyue Xin, Jiang Zhang
School of Systems Science, Beijing Normal University

Preprint submitted to Journal Name, January 1, 2019

1. Introduction

Reconstructing network structure and the underlying dynamics from temporal data is a fundamental problem in network science, with applications in biological, climate, financial and economic systems. However, conventional reconstruction methods, including compressed sensing [?], maximum entropy [?] and Granger causal inference [?], require the underlying dynamical rules to have specific forms, or even to be linear. Although the method presented in [?] does not have these limitations, it can only be applied to continuous dynamical systems, for which the discrete derivative can be calculated. Recently, Kipf et al. [?] proposed an encoder-decoder framework called NRI (Neural Relational Inference), based on the graph network model [?] and deep learning techniques, to reconstruct the network structure and the dynamics simultaneously. However, the graphs it reconstructs are very small (fewer than 40 nodes), and the dynamics are always continuous. A general framework is therefore needed for reconstructing network structure and dynamics from time series data that applies to various kinds of dynamics, including continuous, discrete and even binary ones.

In this work, we present a deep learning framework for dynamics learning and network reconstruction from observed time series data. The model is composed of two parts, a network generator and a dynamics learner. The network generator uses the Gumbel-softmax [?] method to build the graph directly, while the dynamics learner adopts a graph network model [?] with five layers. The network generator and the dynamics learner are trained alternately until the whole model converges. We call the whole framework the Gumbel Graph Network (GGN).

We applied GGN to three representative dynamical models: the Kuramoto oscillator model, representing continuous dynamics; the coupled map lattice model, representing discrete dynamics; and the Boolean network, representing binary dynamics. Our experimental results show that the model not only reconstructs the network structure with high accuracy but also learns the various dynamics efficiently, and that the same graph network architecture is used for all three models. Next, we briefly introduce our method, experiments and results.

2. Method

2.1. Model

In our method, the input consists of trajectories generated by a dynamical model of N interacting objects, whose interactions can be modelled as an unweighted directed graph. The goal is to learn the network structure (specifically, the adjacency matrix) and the dynamical model simultaneously, in an unsupervised way. Our algorithm consists of two jointly trained parts: a network generator, which generates a discrete network using the Gumbel-softmax trick [?]; and a dynamics learner, which uses the generated network and the trajectory value at one time step to predict the values at the next step or the next several steps. We alternately train the network generator for P steps and the dynamics learner for Q steps, where P and Q are hyper-parameters that take different values for different models. The objective is to minimise the error between the predictions and the time series data. When the time series to be predicted is a discrete sequence over a finite set of symbols, we adopt the cross-entropy objective; otherwise we use the mean squared error. Figure 1 outlines the basic structure of our framework.

Figure 1: Basic structure of our framework.
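The alternating schedule and the choice of objective described in Section 2.1 can be illustrated with a minimal standard-library sketch. The function names (`training_schedule`, `select_loss`, `mse_loss`, `cross_entropy_loss`) are hypothetical, and the order within each round (generator first, then learner) is an assumption, since the text does not specify it.

```python
import math


def mse_loss(pred, target):
    """Mean squared error between two equal-length sequences of floats."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def cross_entropy_loss(probs, labels, eps=1e-12):
    """Mean cross-entropy; probs[k] lists class probabilities, labels[k] is the true class index."""
    return -sum(math.log(max(p[y], eps)) for p, y in zip(probs, labels)) / len(labels)


def select_loss(discrete):
    """Cross-entropy for discrete symbol sequences, mean squared error otherwise."""
    return cross_entropy_loss if discrete else mse_loss


def training_schedule(P, Q, rounds):
    """Yield the part updated at each step: P generator steps, then Q learner steps, per round.

    The generator-first ordering is an assumption made for illustration only.
    """
    for _ in range(rounds):
        for _ in range(P):
            yield "generator"
        for _ in range(Q):
            yield "learner"
```

In a real training loop, each yielded label would select which set of parameters receives the gradient update for that step; that part depends on the deep learning library and is left out here.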

2.2. Network Generator

One of the difficulties in reconstructing a network from data is the discreteness of the graph: the back-propagation technique, which is widely used in deep learning and artificial neural networks, cannot be applied directly to discrete networks. To overcome this problem, we use the Gumbel-softmax trick to reconstruct the adjacency matrix of the network directly. This technique uses a continuous distribution to approximate samples from a discrete distribution, so that the back-propagation algorithm can still be applied. Suppose we want to reconstruct a network of size N with adjacency matrix $\{a_{ij}\}_{N \times N}$. The entries $a_{ij}$ are real values given by the Gumbel-softmax function with parameters $p_{ij}$ and temperature $\tau$:

$$a_{ij} = \frac{\exp\left((\log p_{ij} + \xi_{ij})/\tau\right)}{\exp\left((\log p_{ij} + \xi_{ij})/\tau\right) + \exp\left((\log(1 - p_{ij}) + \xi'_{ij})/\tau\right)}, \tag{1}$$

where the $\xi_{ij}$ and $\xi'_{ij}$ are i.i.d. random numbers following the Gumbel distribution. This calculation uses a continuous function with random noise to simulate a discontinuous sampling process, and the temperature $\tau$ adjusts the sharpness of the output. As $\tau \to 0$, $a_{ij}$ takes the value 1 with probability $p_{ij}$ and 0 with probability $1 - p_{ij}$. The $p_{ij}$ are trainable parameters, which are adjusted by the back-propagation algorithm. Thanks to the Gumbel-softmax trick, gradient information can be back-propagated through the whole computation graph even though the process of sampling a random number is non-differentiable.

2.3. Dynamics Learner

Learning with graph-structured data is an active topic in deep learning research, focusing on effective representation learning for the nodes of a graph. Recently, Graph Networks (GNs) [?] have been widely investigated and have achieved state-of-the-art performance in tasks such as node classification and link prediction. In general, a GN takes the graph structure A and the node features X as its input and learns a representation of each node.
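The relaxed edge sampling of Section 2.2 can be sketched in plain Python as a forward pass only, with no gradients. The helper names (`sample_gumbel`, `gumbel_softmax_edge`, `sample_adjacency`) are hypothetical, and the sketch assumes every probability lies strictly between 0 and 1. Each edge value is the two-class Gumbel-softmax: one Gumbel noise term perturbs the log-probability of "edge present", an independent one perturbs "edge absent", and the temperature `tau` controls how close the result is to 0 or 1.

```python
import math
import random


def sample_gumbel(rng):
    """Draw one standard Gumbel(0, 1) sample by inverse transform."""
    u = rng.random()
    while u == 0.0:  # guard: log(0) is undefined
        u = rng.random()
    return -math.log(-math.log(u))


def gumbel_softmax_edge(p, tau, rng):
    """Relaxed Bernoulli sample in [0, 1] for an edge with probability p (0 < p < 1)."""
    on = (math.log(p) + sample_gumbel(rng)) / tau
    off = (math.log(1.0 - p) + sample_gumbel(rng)) / tau
    m = max(on, off)  # subtract the max before exponentiating, for numerical stability
    e_on, e_off = math.exp(on - m), math.exp(off - m)
    return e_on / (e_on + e_off)


def sample_adjacency(probs, tau, seed=0):
    """Sample a relaxed adjacency matrix from a matrix of edge probabilities."""
    rng = random.Random(seed)
    return [[gumbel_softmax_edge(p, tau, rng) for p in row] for row in probs]
```

With a small temperature such as `tau=0.1`, most sampled entries lie very close to 0 or 1, and the fraction of entries above 0.5 approaches the edge probability. In a trainable model the same expression would be written with an automatic-differentiation library so that gradients reach the probabilities.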
We feed the adjacency matrix constructed by the generator directly into the GN, and the errors are calculated and back-propagated. The whole dynamics learner can be expressed as a function:

$$X^{t} = f(X^{t-1}, A), \tag{2}$$

where $X^{t}$ is the state vector of all N nodes at time step t, and A is the adjacency matrix constructed by the network generator. For multi-step prediction, we have

$$X^{t,\dots,T} = f(X^{t-1}, A), \tag{3}$$

where $X^{t,\dots,T}$ represents the states of the system from time t to T, and f is the transition function, whose structure is shown in Figure 2. In Figure 2, the symbol $\otimes$ denotes a Kronecker-style product in which multiplication is replaced by an operator that pairs the two elements. Therefore, if X is a vector, then $X \otimes X^{T}$ is an $N \times N$ matrix whose element in the i-th row and j-th column is the pair $\langle x_i, x_j \rangle$, where $x_i$ is the i-th element of X. The symbol * denotes the element-wise product, and the symbol C denotes the concatenation operator.

3. Experiments

We consider three representative dynamical systems on networks to validate our model: the coupled map lattice model, representing discrete dynamics; the Kuramoto model, representing continuous dynamics; and the Boolean network, representing binary dynamics. Next, we briefly discuss these three models and show our results.

3.1. Coupled Map Lattices

A coupled map lattice is a dynamical system with discrete time and continuous state variables, defined on a chain with a periodic boundary condition. Over the past 30 years, studies of coupled map lattices have improved our understanding of spatiotemporal chaos. In our work, we consider coupled maps on a complex network:

$$x_{t+1}(i) = (1 - s) f(x_t(i)) + \frac{s}{\deg(i)} \sum_{j \in \mathrm{neighbor}(i)} f(x_t(j)), \tag{4}$$

where i denotes a node, and we choose the following logistic map:

$$f(x) = 1 - \mu x^{2}. \tag{5}$$

Figure 2: Dynamics learner architecture.

With suitable values of the parameters s and $\mu$, the model exhibits a rich set of quantitative universality classes, such as spatial bifurcation and frozen chaos. We simulate the coupled map lattice model with between 10 and 100 nodes on different complex networks, in different regimes of chaotic behaviour.
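As an illustration of how such trajectories could be generated, the following sketch iterates the coupled map lattice update on an arbitrary network using the logistic map f(x) = 1 - mu * x^2. The function names, the neighbour-list representation and the uniform initial condition on [-1, 1] are assumptions made for this example. Every node is assumed to have at least one neighbour.

```python
import random


def logistic_map(x, mu):
    """Local map f(x) = 1 - mu * x^2."""
    return 1.0 - mu * x * x


def cml_step(x, neighbors, s, mu):
    """One update: mix each node's own map value with the mean over its neighbours."""
    fx = [logistic_map(v, mu) for v in x]
    new_x = []
    for i, nbrs in enumerate(neighbors):
        coupling = sum(fx[j] for j in nbrs) / len(nbrs)  # (1 / deg(i)) * sum over neighbours
        new_x.append((1.0 - s) * fx[i] + s * coupling)
    return new_x


def simulate_cml(neighbors, s, mu, steps, seed=0):
    """Return a trajectory of steps + 1 states, starting from a random state in [-1, 1]."""
    rng = random.Random(seed)
    x = [rng.uniform(-1.0, 1.0) for _ in neighbors]  # assumed initial condition
    trajectory = [x]
    for _ in range(steps):
        x = cml_step(x, neighbors, s, mu)
        trajectory.append(x)
    return trajectory
```

For example, `simulate_cml([[1, 3], [0, 2], [1, 3], [0, 2]], s=0.2, mu=1.8, steps=50)` runs the dynamics on a 4-node ring. For mu between 0 and 2 the states stay inside [-1, 1], because the map sends that interval into itself and the update is a convex combination.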

Figure 3: (a) The target adjacency matrix. (b) Typical dynamics of the CML model on a random 4-regular graph with N = 20, s = 0.2 and $\mu$ = 1.8. (c)-(e) Adjacency matrices sampled from the Gumbel generator.

We feed all the generated data into our framework. Figure 3 presents the results obtained on a random 4-regular network with N = 20 nodes. After 5 epochs, our model has recovered the network topology accurately. Table 1 lists three numerical experimental results on different networks with various parameters of the dynamics.

Table 1: MSE and accuracy of network reconstruction for simulations on a random 4-regular graph. The number of prediction steps is 5.

3.2. Kuramoto Model

The Kuramoto model is a nonlinear system of phase-coupled oscillators. We use the following differential equation:

$$\frac{d\phi_i}{dt} = \omega_i + \sum_{j \neq i} k_{ij} \sin(\phi_i - \phi_j). \tag{6}$$

We simulate 1D trajectories by solving this equation with a fourth-order Runge-Kutta integrator with step size 0.01. The nodes are phase-coupled oscillators connected by undirected edges: $k_{ij} = 1$ if oscillators i and j are connected, and $k_{ij} = 0$ otherwise. $\omega_i$ is the intrinsic frequency, sampled uniformly from [1, 10), and the initial phase $\phi_i$ is sampled uniformly from [1, 2π). We then subsample the simulated $\phi$ by a factor of 10 and create the trajectories $x_i$ by concatenating $d\phi_i/dt$ and $\sin(\phi_i)$. We simulate Kuramoto dynamics on a small sample network, shown in Figure 4. We generate 50k examples to train and test our method, of which 10k are used for testing. Figure 5 shows a sample of the test data: the left part is the trajectory predicted by our method, and the right part is the trajectory generated by the simulation. The mean squared error of the 20-step prediction from a single input step is 1.20e-2. We compare our result with two baseline methods in Table 2. The comparison indicates that our method captures the graph dynamics better than the state-of-the-art algorithms.

Table 2: Mean squared error (MSE) in predicting future states for simulations.

3.3. Boolean Network

In this model, every variable in a Boolean network takes the value 0 or 1, and a Boolean function is assigned to each node. The function takes the states of the node's neighbours as inputs and returns a binary value that determines the state of the node. We simulate the Boolean network model with between 5 and 50 nodes on different complex networks, and we adjust the dynamics of each model so that its behaviour ranges from low randomness to chaos. Figure 6 shows the training process of our model with 20 nodes: the left sub-figure shows the number of incorrectly reconstructed entries of the adjacency matrix, and the right sub-figure shows the loss function, which is the cross-entropy. Both curves drop to values close to zero.
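A synchronous Boolean network update of the kind described in this section can be sketched as follows. Representing each node's Boolean function as a truth table indexed by the states of its input neighbours is an assumption made for illustration, as are the function names and the uniformly random tables. The text does not say how the Boolean functions were chosen or tuned.

```python
import random


def random_truth_tables(in_neighbors, seed=0):
    """Assign each node a random Boolean function, stored as a table with 2^k entries."""
    rng = random.Random(seed)
    return [[rng.randint(0, 1) for _ in range(2 ** len(nbrs))] for nbrs in in_neighbors]


def boolean_step(state, in_neighbors, tables):
    """Synchronously update all nodes from the current states of their input neighbours."""
    new_state = []
    for nbrs, table in zip(in_neighbors, tables):
        index = 0
        for j in nbrs:  # pack the neighbour states into a binary index
            index = (index << 1) | state[j]
        new_state.append(table[index])
    return new_state


def simulate_boolean(in_neighbors, tables, initial_state, steps):
    """Return the list of steps + 1 binary states visited from initial_state."""
    trajectory = [list(initial_state)]
    for _ in range(steps):
        trajectory.append(boolean_step(trajectory[-1], in_neighbors, tables))
    return trajectory
```

For instance, with two nodes that read each other, `in_neighbors = [[1], [0]]`, and tables `[[0, 1], [1, 0]]` (copy and negation), the state `[0, 0]` is followed by `[0, 1]` and then `[1, 1]`.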
The results for Boolean networks are summarized in Table 3.
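Returning to the Kuramoto data of Section 3.2, the generation procedure described there can be sketched with the standard library alone: a fourth-order Runge-Kutta integrator with step size 0.01, subsampling by a factor of 10, and features built from dφ/dt and sin(φ). The function names are hypothetical. The sign convention sin(φ_i - φ_j) and the sampling ranges follow the text as printed. Storing each feature as a `(dphi_dt, sin_phi)` tuple is one assumed reading of "concatenating".

```python
import math
import random


def kuramoto_rhs(phi, omega, K):
    """dphi_i/dt = omega_i + sum over j != i of K[i][j] * sin(phi_i - phi_j)."""
    n = len(phi)
    return [
        omega[i] + sum(K[i][j] * math.sin(phi[i] - phi[j]) for j in range(n) if j != i)
        for i in range(n)
    ]


def rk4_step(phi, omega, K, dt):
    """Advance all phases by one fourth-order Runge-Kutta step of size dt."""
    k1 = kuramoto_rhs(phi, omega, K)
    k2 = kuramoto_rhs([p + 0.5 * dt * k for p, k in zip(phi, k1)], omega, K)
    k3 = kuramoto_rhs([p + 0.5 * dt * k for p, k in zip(phi, k2)], omega, K)
    k4 = kuramoto_rhs([p + dt * k for p, k in zip(phi, k3)], omega, K)
    return [
        p + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for p, a, b, c, d in zip(phi, k1, k2, k3, k4)
    ]


def simulate_kuramoto(K, steps, dt=0.01, subsample=10, seed=0):
    """Integrate the model and keep every `subsample`-th step as (dphi/dt, sin(phi)) per node."""
    rng = random.Random(seed)
    n = len(K)
    omega = [rng.uniform(1.0, 10.0) for _ in range(n)]       # intrinsic frequencies
    phi = [rng.uniform(1.0, 2.0 * math.pi) for _ in range(n)]  # initial phases, range as in the text
    features = []
    for t in range(steps):
        if t % subsample == 0:
            dphi = kuramoto_rhs(phi, omega, K)
            features.append([(d, math.sin(p)) for d, p in zip(dphi, phi)])
        phi = rk4_step(phi, omega, K, dt)
    return features
```

Here `K` is the symmetric 0/1 coupling matrix of the undirected network, so a call such as `simulate_kuramoto(K, steps=1000)` yields 100 subsampled time points.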

Figure 4: Network structure of 5 interacting objects.

Figure 5: Qualitative comparison of the model prediction (left) and the ground-truth trajectories (right).

Table 3: Results in Boolean Networks.

Figure 6: Number of errors in network reconstruction and training loss versus training iteration for the Boolean network with 20 nodes.

4. Conclusion

In this work, we present the Gumbel Graph Network, a model-free deep learning framework for dynamics learning and network reconstruction from observed time series data. Our method requires no prior knowledge of the underlying dynamics and shows state-of-the-art performance on three typical dynamical systems on complex networks. Further studies on larger networks and more detailed experiments are ongoing.