Earth and Environmental Sciences

The focus area Earth and Environmental Sciences at ScaDS.AI Dresden/Leipzig uses Artificial Intelligence (AI) and big data analytics to describe a multitude of natural phenomena and their effects. These include, for example, the impacts of natural hazards and climate change risks (e.g., extreme weather events or destructive geomorphological processes) and biodiversity loss. AI is becoming increasingly important in this context because most parts of the Earth system are continuously monitored by sensors, and AI can cope with both the volume of the resulting data and its heterogeneous characteristics.

Our applications range from local to global scale and address dynamics of the atmosphere, the oceans, terrestrial regions including the cryosphere and even in-depth processes within the upper crust of the Earth. Air, water, soils, ice, rocks and biodiversity are constantly monitored with dedicated local monitoring systems and satellites. Furthermore, citizen science projects collect data with smartphone apps.

All in all, studying the Earth system and its changes has become a data-intensive research problem. At ScaDS.AI Dresden/Leipzig, we work on the methodological challenges arising in this broad context from different perspectives and consider multiple environmental facets.

We are an interdisciplinary team of domain scientists (geophysics/seismology, geomorphology, physics) and computer scientists. We value the exchange of perspectives across these disciplines when addressing today’s pressing questions on the dynamic Earth.

Research Focus

Climate Research

One of the most pressing questions of our time is the prediction of the future climate and the associated risks. Various AI avenues are being explored to harness the potential of Deep Learning for better projections in this context. For instance, neural networks, the backbone of many AI methods, can now be informed by physical principles (Physics-Informed Neural Networks). This leads to novel types of models that can efficiently represent uncertain processes within complex models. A prominent example is the representation of clouds, which is still a key uncertainty in climate models. Developments of this kind will also be a key contribution to the international development of Digital Twins of the Earth (e.g., EU Destination Earth).
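The idea of informing a model by physical principles can be illustrated with a composite loss that adds a physics residual to the usual data misfit. The following is a minimal sketch under stated assumptions, not a method taken from our projects: the toy equation dy/dt = -k*y, the two-parameter function `model` (a stand-in for a neural network) and all names are hypothetical.

```python
import numpy as np

k = 0.5                                 # assumed known decay rate in dy/dt = -k*y
t_obs = np.array([0.0, 1.0, 2.0])       # sparse observation times
y_obs = np.exp(-k * t_obs)              # noise-free toy observations
t_col = np.linspace(0.0, 5.0, 101)      # collocation points for the physics term


def model(t, params):
    """Stand-in for a neural network: y(t) = a * exp(-b * t)."""
    a, b = params
    return a * np.exp(-b * t)


def physics_informed_loss(params, weight=1.0):
    """Data misfit plus the mean squared residual of dy/dt + k*y = 0."""
    data_loss = np.mean((model(t_obs, params) - y_obs) ** 2)
    y_col = model(t_col, params)
    dy_dt = np.gradient(y_col, t_col)   # finite-difference derivative
    physics_loss = np.mean((dy_dt + k * y_col) ** 2)
    return data_loss + weight * physics_loss


# Parameters that obey the physics should score a lower loss than ones that do not.
loss_consistent = physics_informed_loss((1.0, 0.5))
loss_inconsistent = physics_informed_loss((1.0, 0.1))
```

In an actual Physics-Informed Neural Network, the derivative would typically come from automatic differentiation and the parameters would be found by gradient-based training; the sketch only shows how the two loss terms are combined.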

Terrestrial Ecosystems

For different ecosystems such as forests, grasslands, croplands or even high mountainous regions and glaciers, we focus on a branch of AI that incorporates explainability. This means that every prediction we make with AI should be accompanied by some form of attribution or explanation. For instance, if we aim to predict crop failure, forest dieback, wildfires, flood damage, landslides, debris flows or glacial motion, we are also interested in the likely causes of these events. Identifying them is often harder than expected because several driving factors usually act in combination. We expect that combining traditional process-based modeling approaches with innovative AI and Big Data analytics will lead to improved ecosystem management and adaptation strategies at regional to local scales.

Environmental Seismology

Our planet is shaped by a multitude of physical, chemical and biological processes. Most of these processes and their effects on the ground’s properties can be sensed by seismic instruments, either as discrete events or as ongoing signatures. Seismic methods have been developed, adopted and advanced to study those dynamics at or near the surface of the Earth with unprecedented detail, completeness and resolution. A growing community is collaboratively advancing the emerging scientific discipline of Environmental Seismology: geophysicists interested in Earth surface dynamics, together with geomorphologists, glaciologists, hydrologists, volcanologists, geochemists, biologists and engineering geologists interested in using the emerging geophysical tools and techniques.

Here, we foster methodological developments for an improved understanding of near-surface environmental processes on the exploration scale. Our focus lies on natural hazards and mass movements, especially in the cryosphere and mountainous regions. Working with highly resolved time series data, we combine signal processing, source location (array techniques), modelling and machine learning.


Rhonegletscher (Switzerland): A multitude of physical processes (e.g., melt water flow, crevassing, wind, basal sliding, rock falls) can be sensed by seismic instruments and analyzed with Machine Learning to gain new insights into the glacier’s dynamics (Photo taken by J. Umlauft, 2022).

Earth System Data Cubes

The concept of the Earth System Data Cube (Mahecha et al. 2020) has rapidly become a popular tool in Earth System Sciences in recent years because it greatly facilitates data visualization and (interoperable) data handling, including preprocessing and statistical analyses. The original data sets are transformed in space and time to fit the common grid of the Data Cube, which has three dimensions (longitude, latitude and time) and holds a set of variables that are mapped onto this spatio-temporal grid. Data Cubes are typically chunked, meaning they consist of a set of smaller cubes (chunks) that together form what we call the Earth System Data Cube (ESDC). The ESDC concept allows multiple remotely sensed spatio-temporal data streams to be treated as a single one and therefore makes it possible to interact with a wide range of data.
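The cube structure described above can be sketched in a few lines. This is only an illustration with plain NumPy arrays, not the ESDC implementation: the 1-degree grid, the variable names, the chunk shape and the helper `iter_chunks` are all hypothetical, and the values are random placeholders.

```python
import numpy as np

# Hypothetical common grid: 1-degree cells and 12 time steps.
lat = np.arange(-89.5, 90.0, 1.0)     # 180 cell centres
lon = np.arange(-179.5, 180.0, 1.0)   # 360 cell centres
time = np.arange(12)

rng = np.random.default_rng(0)
shape = (time.size, lat.size, lon.size)

# The cube maps variable names to arrays that share the (time, lat, lon) grid.
cube = {
    "air_temperature": rng.normal(size=shape),
    "soil_moisture": rng.normal(size=shape),
}


def iter_chunks(shape, chunk_shape):
    """Yield tuples of slices that tile an array of `shape` with chunks."""
    for t0 in range(0, shape[0], chunk_shape[0]):
        for y0 in range(0, shape[1], chunk_shape[1]):
            for x0 in range(0, shape[2], chunk_shape[2]):
                yield (
                    slice(t0, min(t0 + chunk_shape[0], shape[0])),
                    slice(y0, min(y0 + chunk_shape[1], shape[1])),
                    slice(x0, min(x0 + chunk_shape[2], shape[2])),
                )


chunk_shape = (12, 60, 90)   # full time axis, 60 x 90 cells in space
chunks = list(iter_chunks(shape, chunk_shape))

# Because all variables share one grid, the same slices address every variable.
first_chunk = {name: values[chunks[0]] for name, values in cube.items()}
```

In practice, chunked cubes are usually stored in a chunk-aware array format and opened lazily, so that only the chunks needed for an operation are read.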

A parallel development is the growing need to apply Machine Learning methods to Earth System Sciences data, as most parts of the Earth system are continuously monitored by sensors and Machine Learning can cope with both the volume of data and the heterogeneous data characteristics. Ideally, classical operations on the ESDC could be extended by Machine Learning applications in order to sustain interoperability. However, there is a conflict between the nature of remotely sensed data, the structure of the ESDC and the requirements for meaningful Machine Learning applications that needs to be addressed:

  1. Sampling the Earth naturally leads to an uneven distribution of data points as a result of its spherical shape. This effect is reinforced by data gaps caused by, e.g., satellite trajectories or cloud cover. Hence, data points are not uniformly distributed across the chunks of the ESDC.
  2. Remotely sensed data tends to be auto-correlated within (and between neighboring) chunks, as data points in close spatio-temporal vicinity naturally have similar values.
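The first point can be illustrated with a simple area-weighted draw. On a regular latitude-longitude grid, the area of a cell is proportional to the cosine of its latitude, so drawing cells uniformly over-represents high latitudes. The sketch below is one possible approach under stated assumptions, not our sampling strategy: the grid, the synthetic field and the 30 % gap fraction are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

lat = np.arange(-89.5, 90.0, 1.0)
lon = np.arange(-179.5, 180.0, 1.0)

# Synthetic 2-D field with artificial gaps (NaN), e.g. from cloud cover.
field = rng.normal(size=(lat.size, lon.size))
field[rng.random(field.shape) < 0.3] = np.nan

# Cell area on a regular lat-lon grid is proportional to cos(latitude).
weights = np.cos(np.deg2rad(lat))[:, None] * np.ones(lon.size)[None, :]
weights[np.isnan(field)] = 0.0           # never draw a missing value
prob = (weights / weights.sum()).ravel()

# Draw grid cells with probability proportional to their area.
n_samples = 5000
flat_idx = rng.choice(field.size, size=n_samples, replace=False, p=prob)
lat_idx, lon_idx = np.unravel_index(flat_idx, field.shape)
samples = field[lat_idx, lon_idx]
```

With these weights, the sampled cells cluster towards the equator in index space, which corresponds to an approximately even coverage per unit area on the sphere.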

Therefore, Machine Learning in the Earth system context must respect the basic principles of geo-data and go well beyond naive applications of standard methods. We focus on the development of sophisticated and efficient sampling strategies for Data Cubes and on ML tools that can operate on these large cloud-hosted data sets.
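One common way to account for spatial auto-correlation when evaluating a model is to hold out whole spatial blocks instead of individual grid cells. The following sketch shows such a block-wise split; it is an illustrative assumption rather than the sampling strategy developed here, and the grid size, the block edge length and the 20 % test share are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(7)

n_lat, n_lon = 180, 360
block = 30   # hypothetical block edge length in grid cells

# Label every grid cell with the id of the spatial block it falls in.
block_rows = np.arange(n_lat) // block
block_cols = np.arange(n_lon) // block
n_block_cols = n_lon // block
block_id = block_rows[:, None] * n_block_cols + block_cols[None, :]

# Hold out whole blocks rather than individual cells.
all_blocks = np.unique(block_id)
test_blocks = rng.choice(all_blocks, size=all_blocks.size // 5, replace=False)
test_mask = np.isin(block_id, test_blocks)
train_mask = ~test_mask
```

Cells on either side of a block boundary remain correlated, so a buffer zone between training and test blocks may be needed in practice.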

Multivariate Earth System Data Cubes (Figure by Maximilian Söchting, RSC4Earth, Leipzig University)

Clouds and Global Climate

Extreme weather and climate events (droughts, floods, storms, and many more) are particularly impactful. Therefore, gathering knowledge on their mechanisms and possible distributional shifts under climate change is an important research field.

Extremes can interact: contemporaneous extremes in multiple variables, at separate locations, or extremes at different points in time can exacerbate (or weaken) the strain they put on impacted systems. Co-occurring wind and rain can lead to storm surges, simultaneous droughts can strain the world’s food system, and an early-spring heat wave combined with a growing-period frost can lead to crop losses. These types of events are called “compound” extreme events.
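A simple way to make the notion of a compound event concrete is to count how often two variables exceed a high percentile at the same time. The sketch below uses synthetic, correlated "wind" and "rain" series; the data, the 95th-percentile threshold and all names are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n_days = 10_000

# Synthetic, positively correlated daily wind and rain anomalies.
shared = rng.normal(size=n_days)
wind = 0.7 * shared + 0.3 * rng.normal(size=n_days)
rain = 0.7 * shared + 0.3 * rng.normal(size=n_days)

# Univariate extremes: values above the 95th percentile of each variable.
wind_extreme = wind > np.quantile(wind, 0.95)
rain_extreme = rain > np.quantile(rain, 0.95)

# Compound event: both variables are extreme on the same day.
compound = wind_extreme & rain_extreme

p_compound = compound.mean()
# Reference value if the two variables were independent.
p_independent = wind_extreme.mean() * rain_extreme.mean()
```

Comparing `p_compound` with `p_independent` indicates how strongly the dependence between the variables raises the frequency of joint extremes.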

A common way to study extremes is to build a model of the impacted system (crops, forests, rivers, and more), drive this model using weather data, and then examine the results. However, extremes, especially multivariate events, are rare, so observational records contain too few of them for robust statistics. Therefore, “weather generators” were developed that simulate time series of artificial weather data with statistical properties similar to observations.
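The principle of a weather generator can be sketched with a deliberately simple statistical model. The example below fits a first-order autoregressive (AR(1)) process to a single series and then simulates a much longer artificial one. This is a textbook stand-in, not the generative models we employ: the "observations" are themselves synthetic, and the function names are hypothetical.

```python
import numpy as np


def fit_ar1(series):
    """Estimate mean, lag-1 coefficient and innovation std of a series."""
    mean = series.mean()
    anomalies = series - mean
    phi = np.sum(anomalies[1:] * anomalies[:-1]) / np.sum(anomalies[:-1] ** 2)
    residuals = anomalies[1:] - phi * anomalies[:-1]
    return mean, phi, residuals.std()


def generate_ar1(mean, phi, sigma, n_steps, rng):
    """Simulate an artificial series from an AR(1) model with |phi| < 1."""
    x = np.empty(n_steps)
    x[0] = rng.normal(scale=sigma / np.sqrt(1.0 - phi**2))  # stationary start
    for t in range(1, n_steps):
        x[t] = phi * x[t - 1] + rng.normal(scale=sigma)
    return mean + x


rng = np.random.default_rng(3)

# Stand-in for observations: a synthetic temperature-like series.
observed = generate_ar1(mean=10.0, phi=0.8, sigma=2.0, n_steps=5000, rng=rng)

# Fit the model and generate a series ten times longer than the record.
mean, phi, sigma = fit_ar1(observed)
synthetic = generate_ar1(mean, phi, sigma, n_steps=50_000, rng=rng)
```

The longer synthetic series contains more rare events than the original record while preserving its mean, variance and day-to-day persistence, which is what makes such generators useful for driving impact models.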

However, most weather generators have limitations: many can only be applied to single sites or single variables, or they require strong assumptions. Machine Learning, especially the rapidly growing field of generative AI, holds promise for overcoming these limitations. We employ these and other probabilistic methods to generate artificial weather data, thereby aiding researchers who study the impacts of extremes.

Aims

Earth System Science (ESS) research is a broad field that incorporates a multitude of subdomains and scientists specialized in, e.g., remote sensing, seismology, engineering geophysics, geology, sedimentology, hydrology, meteorology or geoecology. Within those domains, experts typically work with a certain data type and/or on a pre-defined spatio-temporal scale and resolution inherent to the data they are interested in. For instance, the remote sensing community commonly works with gridded satellite imagery data covering nearly the entire atmosphere and surface of the Earth with a temporal resolution in the range of hours to days.

This data is well suited to study, e.g., weather patterns, climate change, land cover, or natural disasters such as hurricanes or wildfires. On the other hand, the Earth’s interior is continuously monitored using large seismic networks and dense arrays, which are distributed all over the world with different geometries and acquire up to thousands of data points per second. Such point-wise ground motion measurements allow for, e.g., earthquake observation, subsurface tomography or the investigation of near-surface dynamics and natural hazards. Beyond remote sensing and seismology, there are many more multimodal and multiscale data types from various sensors with different characteristics.

Those fundamental differences in resolution, scale and data type hinder interactions between the ESS subdomains. As of today, our community lacks interoperable tools and workflows that go beyond traditional methods and allow for the integration of this rich, heterogeneous mix of data.

Funded by:
Federal Ministry of Education and Research (Bundesministerium für Bildung und Forschung).
Free State of Saxony (Freistaat Sachsen).