Contact

Prof. Dr. Ivo F. Sbalzarini

Chair of Scientific Computing for Systems Biology

TUD Dresden University of Technology

sbalzarini@mpi-cbg.de

Learning Spatiotemporal Models from Limited Noisy Data

Duration: January 2021 – December 2025

Research Area: Applied Data Science and AI – Life Science and Medicine

Data in the life sciences are typically sparse (few samples in high-dimensional spaces) and noisy (measurement uncertainty and intrinsic biological variability). In the project “Learning Spatiotemporal Models from Limited Noisy Data”, we enable machine learning to infer interpretable, symbolic mathematical models of biomedical dynamics in space and time from data that are sparse (a few hundred samples) and noisy (up to 30% noise). The resulting equation models can then be analyzed with standard mathematical tools to gain insight into the stability, bifurcations, and physical processes at play in the observed biomedical process.
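One common way to pose this kind of symbolic model inference is as sparse regression over a library of candidate terms. The formulation below is a hedged sketch for orientation, not necessarily the exact formulation used in the project; the symbols $u$, $U_t$, $\Theta$, $\xi$, and $\lambda$ are notation assumed here for illustration.

```latex
% Assumed notation: u(x,t) is the observed field, U_t stacks estimates of
% \partial u / \partial t at the sample points, and each column of \Theta
% holds one candidate term (e.g. u, u^2, u_x, u_{xx}) at those same points.
\frac{\partial u}{\partial t} = \Theta(u)\,\xi ,
\qquad
\hat{\xi} = \arg\min_{\xi}\; \tfrac{1}{2}\,\lVert U_t - \Theta\,\xi \rVert_2^2
            + \lambda\,\lVert \xi \rVert_0 .
```

The nonzero entries of $\hat{\xi}$ select which candidate terms appear in the inferred equation, and $\lambda$ controls how strongly additional terms are penalized.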

Aims

Develop a machine learning algorithm for symbolic equation model inference from sparse and noisy spatiotemporal measurement data.
Provide a mathematical way of guaranteeing physical consistency of the inferred models.
Develop a deep neural network architecture to forecast dynamics from past samples in a way that is numerically consistent.
Mathematically characterize the high-dimensional phase space of the inferred models in terms of its topology (e.g., equilibrium points, attractors, and extremal points).
Provide a computable measure for the similarity or equivalence of two dynamical models.

Problem

Learning interpretable mathematical models from observational data has received considerable research attention over the past 10 years. Previous methods, however, suffered from three main problems: they were sensitive to noise in the data, they required large amounts of data, or they had user-tunable parameters that could be set to obtain almost any desired result. Together, these three limitations have so far prevented the use of these methods in the life sciences, where the true model is unknown and data are sparse and noisy.

Practical Example

In a proof of concept, preliminary results from this project were used to automatically infer the molecular protein interaction network during early embryo polarization of C. elegans from a single microscopy video (see Maddu et al., Proc. R. Soc. A, 2022). The interactions inferred from the video were identical to published ones found in biochemical experiments.

Technology

Sparse high-dimensional regression
Stability selection
Structured norms and group sparsity
Deep learning
Approximate combinatorial optimization using iterative hard thresholding
Persistent homology
Determinantal Point Processes
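To make two of the listed techniques concrete, the following is a minimal sketch of iterative hard thresholding combined with stability selection on synthetic data. It is an illustration under stated assumptions, not the project's implementation: the function names `iht` and `selection_frequencies`, the random candidate library, the sparsity budget `k=3`, and the 0.8 selection threshold are all hypothetical choices made here.

```python
import numpy as np

def iht(A, b, k, n_iter=200):
    """Iterative hard thresholding: approximately solve
    min ||A x - b||_2^2 subject to x having at most k nonzeros."""
    # Step size 1/L, with L the largest eigenvalue of A^T A.
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = x + step * A.T @ (b - A @ x)    # gradient step on the residual
        keep = np.argsort(np.abs(x))[-k:]   # indices of the k largest entries
        pruned = np.zeros_like(x)
        pruned[keep] = x[keep]              # hard threshold: zero the rest
        x = pruned
    return x

def selection_frequencies(A, b, k, n_subsamples=50, seed=0):
    """Stability selection: fraction of random half-subsamples
    in which each candidate term is selected by IHT."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    counts = np.zeros(A.shape[1])
    for _ in range(n_subsamples):
        idx = rng.choice(n, size=n // 2, replace=False)
        counts += (iht(A[idx], b[idx], k) != 0)
    return counts / n_subsamples

# Synthetic stand-in for a candidate-term library (assumption: random columns).
rng = np.random.default_rng(1)
n_samples, n_terms = 200, 10
A = rng.standard_normal((n_samples, n_terms))
xi_true = np.zeros(n_terms)
xi_true[[2, 7]] = [1.0, -0.5]               # only terms 2 and 7 are active
clean = A @ xi_true
b = clean + 0.3 * clean.std() * rng.standard_normal(n_samples)  # 30% noise

freq = selection_frequencies(A, b, k=3)     # k deliberately overestimates sparsity
stable = np.flatnonzero(freq >= 0.8)        # terms selected in >= 80% of subsamples
print(np.round(freq, 2), stable)
```

The design idea is that terms truly present in the data should be selected in nearly every subsample, whereas the surplus slot allowed by `k=3` is expected to be filled by different spurious terms from one subsample to the next, so thresholding the selection frequency is less sensitive to the exact choice of `k` than a single fit would be.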

Outlook

Going forward, the project “Learning Spatiotemporal Models from Limited Noisy Data” will expand into classifying different models according to their high-dimensional loss landscape or their phase space. This touches upon the question of when (and in what sense) two dynamical models are equivalent or “similar”. To approach this question, we will use concepts from topological data analysis. The project will also extend towards learning mutual spatial arrangements of objects, for which we will use determinantal point processes and investigate how they can be stably identified from limited and noisy measurement data.
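As a hedged illustration of why determinantal point processes are a natural fit for spatial arrangements, the sketch below evaluates the probability of point subsets under an L-ensemble DPP, P(S) = det(L_S) / det(L + I). The Gaussian similarity kernel, the three example points, and the function names `rbf_kernel` and `dpp_log_prob` are assumptions made here for illustration and are not taken from the project.

```python
import itertools
import numpy as np

def rbf_kernel(points, length_scale=1.0):
    """Gaussian similarity kernel between all pairs of points."""
    diff = points[:, None, :] - points[None, :, :]
    return np.exp(-np.sum(diff ** 2, axis=-1) / (2 * length_scale ** 2))

def dpp_log_prob(L, subset):
    """Log-probability that an L-ensemble DPP draws exactly `subset`:
    log det(L_S) - log det(L + I)."""
    if len(subset) == 0:
        logdet_sub = 0.0                    # det of the empty matrix is 1
    else:
        sub = L[np.ix_(subset, subset)]
        _, logdet_sub = np.linalg.slogdet(sub)
    _, logdet_norm = np.linalg.slogdet(L + np.eye(L.shape[0]))
    return logdet_sub - logdet_norm

# Three objects on a line: two close together, one far away.
points = np.array([[0.0, 0.0], [0.1, 0.0], [3.0, 0.0]])
L = rbf_kernel(points)

close_pair = dpp_log_prob(L, [0, 1])        # nearly coincident objects
far_pair = dpp_log_prob(L, [0, 2])          # well-separated objects

# Sanity check: probabilities over all subsets sum to one.
total = sum(
    np.exp(dpp_log_prob(L, list(s)))
    for r in range(len(points) + 1)
    for s in itertools.combinations(range(len(points)), r)
)
print(close_pair, far_pair, total)
```

Under this kernel the well-separated pair receives a higher probability than the nearly coincident pair, which is the repulsive behavior that makes such processes useful for modeling how objects arrange themselves relative to one another.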

Publications

D. Sturm, S. Maddu, and I. F. Sbalzarini. Learning locally dominant force balances in active particle systems. arXiv:2307.14970, 2023.
S. Maddu, D. Sturm, B. L. Cheeseman, C. L. Müller, and I. F. Sbalzarini. STENCIL-NET for equation-free forecasting from data. Sci. Rep., 13:12787, 2023.
S. Maddu, B. L. Cheeseman, I. F. Sbalzarini, and C. L. Müller. Stability selection enables robust learning of differential equations from limited noisy data. Proc. R. Soc. A, 478(2262):20210916, 2022.
S. Maddu, D. Sturm, C. L. Müller, and I. F. Sbalzarini. Inverse Dirichlet weighting enables reliable training of physics informed neural networks. Mach. Learn.: Sci. Technol., 3:015026, 2022.
S. Maddu, B. L. Cheeseman, C. L. Müller, and I. F. Sbalzarini. Learning physically consistent differential equation models from data using group sparsity. Phys. Rev. E, 103:042310, 2021.
S. Maddu, D. Sturm, B. L. Cheeseman, C. L. Müller, and I. F. Sbalzarini. Learning computable models from data. In F. Chinesta, R. Abgrall, O. Allix, and M. Kaliske, editors, Proc. 14th World Congress on Computational Mechanics (WCCM), pages 1–6, DOI: 10.23967/wccm-eccomas.2020.190, 2021. ECCOMAS.

Team

Lead

Prof. Dr. Ivo F. Sbalzarini

Team Members

Dominik Sturm
Suryanarayana Maddu
Dr. Abhishek Behera

Partners

Funded by the German Federal Ministry of Education and Research (Bundesministerium für Bildung und Forschung).

ScaDS.AI Dresden/Leipzig (Center for Scalable Data Analytics and Artificial Intelligence) is a center for Data Science, Artificial Intelligence and Big Data with locations in Dresden and Leipzig.