Using SciML to predict the time evolution of a complex network.

07/28/2022, 11:40 AM — 11:50 AM UTC
Green

Abstract:

Modeling the temporal evolution of complex networks is still an open challenge across many fields. Using the SciML ecosystem in Julia, we train and simplify a Neural ODE on the low-dimensional embeddings of a temporal sequence of networks. In this way, we discover a dynamical system representation of the network that allows us to predict its temporal evolution. In the talk we’ll show how the tight integration of SciML, network, and matrix-algebra packages in Julia opens new modeling directions.

Description:

Introduction

Complex networks can change over time as vertices and edges between vertices are added or removed. Modeling the temporal evolution of networks and predicting their structure is an open challenge across a wide variety of disciplines: from the study of ecological networks such as food webs, to predictions about the structure of economic networks; from the analysis of social networks, to the modeling of how our brain develops and adapts during our lives. In their usual representation, networks are binary (an edge is either observed or not), sparse (each vertex is linked to only a small subset of the other vertices), and large (up to billions of nodes), and changes are discrete rewiring events. These properties make them hard to handle with classic machine learning techniques and have barred the use of more traditional mathematical modeling tools such as differential equations. In this talk, we show how we used Julia, and in particular the Scientific Machine Learning (SciML) framework, to model the temporal evolution of complex networks as continuous, multivariate, dynamical systems learned from observational data. We took an approach that cuts across different mathematical disciplines (machine learning, differential equations, and graph theory): this was possible largely thanks to the integration of packages like Graphs.jl (and the companion MetaGraphs.jl package), the LinearAlgebra standard library (and other matrix-decomposition packages), and the SciML ecosystem, e.g., DiffEqFlux.jl.

Methodology

  1. To translate the discrete, high-dimensional dynamical system into a continuous, low-dimensional one, we rely on a network embedding technique. A network embedding maps the vertices of a network to points in a (low-dimensional) metric space. We adopt the well-studied Random Dot Product Graph (RDPG) statistical model: the mapping is provided by a truncated Singular Value Decomposition of the network’s adjacency matrix, and to reconstruct the network we use the fact that the probability of interaction between two vertices is given by the dot product of the points they map to (a minimal sketch of this step appears right after this list). The decomposition of the adjacency matrices and their alignment is a computationally intensive step, which we tackle thanks to the fast matrix algorithms available in Julia and their integration with packages for network data wrangling (like Graphs.jl).
  2. In the embedding framework, a discrete change in the network is modeled as the effect of a continuous displacement of the points in the metric space. Our goal, then, is to discover from the data (the network observed at various points in time) a dynamical system that adequately captures the laws governing the temporal evolution of the complex network. This is possible thanks to a pipeline that combines Neural ODEs with the identification of nonlinear dynamical systems (e.g., SINDy); a sketch of the Neural ODE step follows the next paragraph.
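
As a rough illustration of step 1, here is a minimal sketch of an RDPG-style embedding of a single snapshot via a truncated SVD. The synthetic graph, the embedding dimension d, and the variable names are illustrative assumptions, not the talk’s actual pipeline (which, in addition, has to align the embeddings of successive snapshots).

```julia
using Graphs, LinearAlgebra

# Synthetic stand-in for one observed network snapshot
# (the real data is a temporal sequence of observed networks).
g = erdos_renyi(100, 0.05)
A = Matrix{Float64}(adjacency_matrix(g))      # dense 0/1 adjacency matrix

# Truncated SVD: keep only the d leading singular triplets.
d = 4
F = svd(A)
X = F.U[:, 1:d] * Diagonal(sqrt.(F.S[1:d]))   # left embedding: one point in R^d per vertex
Y = F.V[:, 1:d] * Diagonal(sqrt.(F.S[1:d]))   # right embedding (≈ X for an undirected graph)

# RDPG reconstruction: the dot product of the points of vertices i and j
# approximates their probability of interaction.
P_est = clamp.(X * Y', 0, 1)
```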

In general, each node may influence the temporal evolution of every other node in the network and, if we work in an embedding space of dimension d with N nodes, this translates to a dynamical system with N × N × d variables. As networks often have thousands or millions of nodes, that number can be huge. In the talk we will discuss the various strategies we adopted to tame the complexity of the dynamical system.
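
To make step 2 concrete, here is a minimal sketch of a Neural ODE on the flattened embeddings, assuming the Flux-based NeuralODE constructor of DiffEqFlux.jl (as of 2022). The layer sizes, time span, and placeholder data are illustrative assumptions rather than the configuration used in the talk; keeping the hidden layer small is one simple way to limit the number of trainable parameters.

```julia
using DiffEqFlux, OrdinaryDiffEq, Flux

# Hypothetical sizes: N vertices embedded in d dimensions, flattened into one state vector.
N, d = 202, 4
state_dim = N * d

# A small neural network stands in for the unknown vector field of the embedded dynamics.
dudt = Chain(Dense(state_dim, 32, tanh), Dense(32, state_dim))

# Wrap it as a Neural ODE over the observation window (six daily snapshots as an example).
tspan = (0.0f0, 5.0f0)
node = NeuralODE(dudt, tspan, Tsit5(); saveat = 0.0f0:1.0f0:5.0f0)

# Placeholder for the embedding of the first snapshot; node(x0) returns the predicted
# trajectory of embeddings, to be fitted against the embeddings of the observed snapshots,
# e.g. with a squared-error loss such as
#   loss(p) = sum(abs2, Array(node(x0, p)) .- observed_embeddings)
x0  = randn(Float32, state_dim)
sol = node(x0)

# After training, a sparse-regression step (e.g. SINDy via DataDrivenDiffEq.jl) can
# distil the learned vector field into an interpretable dynamical system.
```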

Future Development

As a proof of concept, we tested our modeling approach on a network of wild birds interacting over the span of 6 days, collected by the team behind the Animal Social Network Repository (ASNR). The network has 202 vertices (birds) and 11899 edges (contacts between birds). In the talk we will use this application to illustrate the strengths and current limitations of our approach.

We are now working on three fronts:

  • we need to scale up the framework so that it can model very large networks (for example social networks, where being able to predict which people might link to others in the future could be a tool to fight the growing problem of misinformation and disinformation);
  • we are considering stepping from Neural Differential Equations to Universal Differential Equations, both to incorporate any preexisting knowledge of the network’s dynamics and to reduce the training complexity;
  • we are exploring data-augmentation techniques that interpolate between the estimated embeddings, as well as alternative embeddings and alternative neural-network architectures.

These three development directions constitute interesting challenges for the Julia practitioner interested in extending Julia’s modeling abilities. We will discuss them in the talk and suggest how everyone can contribute.

All the code will be made available in a dedicated git repository and we are preparing a detailed publication to illustrate our approach.
