Open Topics for Research Associates / PhD Students (f/m/x) within the ScaDS.AI Graduate School

Within ScaDS.AI, we welcome new researchers to our teams in Dresden and Leipzig. This page lists suggestions for research topics that are currently available within the ScaDS.AI Graduate School, together with mentors and host institutions. Feel free to look around to decide which topics excite you most and best match your skills!

The official job posting will be published soon.

Knowledge representation and inference

Belongs to project: Linear Time Algorithms for Ontology-Mediated Querying (LinOMQ)

Supervisors: Prof. Dr. Carsten Lutz, Prof. Dr. Sebastian Rudolph

Location: Leipzig

% of Position: 100%

Abstract: Ontology-Mediated Querying (OMQ) is an important paradigm for semantically enhanced data access in artificial intelligence (AI) applications. The main idea is that an ontology is used to enrich data with domain knowledge, enabling more flexible queries and more complete answers. Since AI applications often require huge quantities of data, it is paramount that OMQ can be implemented in an efficient way. In particular, the data complexity of querying should be polynomial time, preferably even linear time.
The aim of this project is to investigate the possibilities and limits of linear-time ontology-mediated querying. Revolving around this aim, we will study many relevant settings. On the one hand, we will consider several modes of query evaluation including single-testing, all-testing, enumeration, and direct access. On the other hand, we will study a broad range of ontology and query languages. Regarding the former, our focus will be on existential rules (also known as tuple generating dependencies), including formalisms such as guarded and frontier-guarded rules, acyclic sets of rules, sticky sets of rules, and others. Regarding the latter, we will consider conjunctive queries, unions thereof, and acyclic variations. As the outcome of this project, we envision a rather complete characterization of those OMQs that can be evaluated in linear time versus those that cannot; the lower bounds will most likely depend on suitable assumptions from fine-grained complexity theory.

Belongs to project: Combining Description Logics with Argumentation Frameworks

Supervisors: Prof. Dr.-Ing. Franz Baader, Prof. Dr. Ringo Baumann

Location: Dresden

% of Position: 100%

Abstract: Description Logics (DLs) and Argumentation Frameworks (AFs) are prominent symbolic AI formalisms, offering complementary approaches to knowledge representation (KR). DLs provide ways to define key notions of an application domain and support reasoning to derive new knowledge. AFs represent arguments and their relationships (such as attack and support) and facilitate conflict resolution through various argumentation semantics. Despite their prominence in KR, there has been little interaction between these two areas so far. The project investigates their combination, with the goal of constructing methods for improving the quality of AFs and DL knowledge bases used in applications, and thus improving the trust in the results produced by the investigated symbolic AI methods.

Mathematical foundations of AI and representation learning

Belongs to project: Information Theory and Geometric Inference for Time-Dependent Point Cloud Data

Supervisors: Prof. Dr. Ivo F. Sbalzarini, Prof. Dr. Jürgen Jost

Location: Dresden

% of Position: 100%

Abstract: In the era of rapidly increasing data volumes, modern experimental measurement technologies can capture a wide range of features, including non-scalar values for each data point. This makes it more important than ever to incorporate as much information as possible when modeling complex data sets. The core idea in our approach is to represent a configuration of points (e.g., measurement samples) by focusing on the relative positioning of each point among its neighbors, thereby capturing the local geometry. Traditionally, this modeling could be achieved by comparing observations, such as in point clouds where Euclidean distance is used to compute differences between features and aggregate them within a feature set. However, we propose a novel approach by considering the information exchanged between points, which provides a deeper and more nuanced understanding of their relationships.

Belongs to project: Information Theory and Geometric Inference for Time-Dependent Point Cloud Data

Supervisors: Prof. Dr. Sayan Mukherjee, Prof. Dr. Ivo F. Sbalzarini

Location: Leipzig

% of Position: 75%

Abstract: We are interested in the scientific interpretation of the information that is represented in continuously labeled point clouds. Examples of such point clouds could be biomedical images. Images are usually represented on a Cartesian point cloud given by the pixels or voxels (in 3D). However, content-adaptive image representations, such as the Adaptive Particle Representation (APR) of images, produce more general point clouds. We propose further developments in mathematical information theory as a way of systematically deriving such information measures over point cloud data. 

Belongs to project: Biologically informed representation learning for single-cell analyses

Supervisors: Prof. Dr. Markus Scholz, Dr. rer. nat. Kristin Reiche

Location: Leipzig

% of Position: 65%

Abstract: Molecular data from single-cell technologies are transforming our understanding of biological mechanisms and disease pathways. To enhance explainability, we aim to develop methods that integrate biological pathways and biomedical domain knowledge into representation learning for multimodal single-cell data. This approach will improve data interpretation, enable biological predictions, and facilitate cross-species comparisons.
We are looking for candidates with a strong background in (bio)informatics or a related discipline. Experience with machine learning methods and single-cell data analysis is advantageous, and an interest in biology is encouraged. We offer a stimulating interdisciplinary working environment in a cutting-edge research field. We look forward to receiving your application.

Belongs to project: Biologically informed representation learning for single-cell analyses

Supervisors: Dr. rer. nat. Kristin Reiche, Prof. Dr. Jens Lehmann

Location: Leipzig

% of Position: 65%

Abstract: Characterizing biological cells through single-cell sequencing has revolutionized biomedical research. The molecular profiling of individual cells from various tissues aids drug development, drug repurposing, and side effect assessment. Recent advancements in Large Language Models (LLMs) have led to their application in genomics and single-cell studies. The first pre-trained language models, like scBERT for cell-type annotation and scGPT, the first foundation model for cell biology, are now available. However, our understanding of the extent to which these cell language models may generate misleading information is still in its early stages. This PhD thesis aims to integrate biological knowledge, primarily represented as graph-structured data, into generative models for cell biology. The goal is to enhance the accuracy and inference capabilities of LLMs in cell biology.

Belongs to project: Biologically informed representation learning for single-cell analyses

Supervisors: Prof. Dr. Martin Sedlmayr, Prof. Dr. Markus Scholz

Location: Dresden

% of Position: 100%

Abstract: Single-cell RNA sequencing (scRNA-seq) faces challenges like rare cell type scarcity and imbalanced datasets, limiting analytical precision. Synthetic data generation and advanced annotation methods may overcome these hurdles.
Key Focus Areas: Develop cutting-edge synthetic data tools (GANs, autoencoders, ConvGeN). Test and integrate multi-omics layers ((epi)genomics, proteomics) for scRNA-seq. Compare LLM-based annotation with traditional methods for rare-cell analysis. Collaborate on real-world applications in projects like PM4Onco and SaxoCell.
Candidate Profile: Strong background in bioinformatics or AI, motivated to tackle real-world biomedical challenges using cutting-edge computational methods.

Belongs to project: Biologically informed representation learning for single-cell analyses

Supervisors: Prof. Dr. Bogdan Franczyk, Dr. Nico Scherf

Location: Leipzig

% of Position: 100%

Abstract: Recent advances in image-driven AI underscore the power of computer-assisted analysis in biology and medicine. Yet the reliance on expert-curated cases curbs the utility of large databases lacking consistent annotations. Self-supervised learning (SSL) offers a shift by enabling label-free pretraining, drastically reducing curated data needs. It can also leverage text and time series to enrich image-based feature extraction. This PhD project aims to refine SSL methods—such as unsupervised contrastive learning—for diverse biological datasets. We will adopt and advance VAE-based models and joint embedding architectures tailored for microscopy and neuroimaging. These approaches promise deeper insights into diseases like prostate cancer and clearer understanding of AI-driven data embeddings.

Belongs to project: Systems AI: A High-Dimensional Dynamical Systems Perspective on Neural Networks

Supervisors: Prof. Dr. Bernd Rosenow, Prof. Dr. Frank Cichos

Location: Leipzig

% of Position: 75%

Abstract: This project applies methods from statistical physics to explore how large neural networks learn in high-dimensional settings. Using random matrix theory as a key tool, we will analyze the weight matrices of models like transformers to understand how they evolve from random initializations to meaningful structures. By examining the statistical patterns in these matrices, we aim to pinpoint which components capture the most relevant information and how they relate to tasks such as language understanding. Our approach combines both theoretical insights and practical experiments on benchmark datasets, providing a hands-on introduction to modern machine learning methods. We welcome candidates with a solid background in statistical physics who are excited to learn about deep learning and contribute to bridging the gap between physics theory and AI practice.

Belongs to project: Systems AI: A High-Dimensional Dynamical Systems Perspective on Neural Networks

Supervisors: Dr. Nico Scherf, Prof. Dr. Sayan Mukherjee

Location: Leipzig

% of Position: 75%

Abstract: Join our PhD project on geometry-informed representation learning for high-dimensional (e.g. neural) data. You will develop robust, theory-driven methods to uncover and model manifold structures, using tools from Riemannian geometry, topological data analysis, and Bayesian approaches. The work explores new ways to integrate geometric constraints and uncertainty quantification, enabling more interpretable and efficient deep manifold learning. We will apply these methods across domains, revealing hidden patterns and structures in complex datasets. Combining manifold learning with generative models, we also explore advanced VAE variants, hyperbolic latent spaces, and chart-based autoencoders—aiming for flexible, data-driven discovery of underlying geometric and topological properties.

Belongs to project: Systems AI: A High-Dimensional Dynamical Systems Perspective on Neural Networks

Supervisors: Prof. Dr. Ivo F. Sbalzarini, Dr. Nico Scherf

Location: Dresden

% of Position: 100%

Abstract: Understanding the dynamics of high-dimensional systems is crucial for interpreting and predicting complex behaviors. However, inferring system dynamics from observed data, especially in the presence of latent variables or unobserved confounders, remains a significant challenge. This project aims to address this challenge by developing methods to discover interpretable latent dynamics from high-dimensional data, building on previous work of the Sbalzarini group.
The key questions we aim to address include: How can we infer system dynamics from latent representations of high-dimensional systems (e.g., from autoencoders)? What are the limits of identifiability and how can we quantify uncertainty, particularly when confounders are unobserved? How can we determine whether meaningful system dynamics exist in a given latent representation? And finally, how can we steer a deep neural network towards learning representations in which the dynamics become “as simple as possible” (ideally linear)?

Belongs to project: Systems AI: A High-Dimensional Dynamical Systems Perspective on Neural Networks

Supervisors: Prof. Dr. Frank Cichos, Dr. Nico Scherf

Location: Leipzig

% of Position: 75%

Abstract: Join our PhD project exploring emergent complexity and latent geometry in real microscopic physical systems, bridging advanced computational tools and experimental setups. We investigate active particle ensembles and random photonic media to discover how high-dimensional dynamics can be captured and controlled in low-dimensional latent spaces. By correlating neural network states with real-world many-body systems, we reveal how large-scale complexity emerges from simple building blocks. This approach enables novel feedback loops and coarse-graining perspectives, driving insights into the interplay of physics and AI, and opening paths to robust, low-energy computational substrates.

 

Scalable ML and LLM inference

Belongs to project: Unconventional Computing for Efficient LLM Inference

Supervisors: Prof. Dr.-Ing. Diana Göhringer, Prof. Dr. Wolfgang E. Nagel

Location:

% of Position: 100%

Abstract: This PhD project will investigate and realize scalable FPGA-based reconfigurable computing architectures for Large Language Models (LLMs). A strong focus will be put on modular hardware components, temporal and spatial architectures, memory-centric technologies, and efficiency techniques such as approximation, sparsity, and quantization.

Belongs to project: Unconventional Computing for Efficient LLM Inference

Supervisors: Prof. Dr.-Ing. Jeronimo Castrillon, Prof. Dr.-Ing. Diana Göhringer

Location:

% of Position: 100%

Abstract: The extreme compute and memory requirements of Large Language Models (LLMs) have led to disruptions in data-centric architectures. In-memory and near-memory systems, for instance, are promising approaches since they considerably reduce the need to move data within the computing system. This project addresses the challenge of devising programming and runtime methodologies to automatically optimize high-level code (written in Python) for efficient execution on unconventional computing systems.

Belongs to project: Unconventional Computing for Efficient LLM Inference

Supervisors: Prof. Dr. Wolfgang E. Nagel

Location: Dresden

% of Position: 100%

Abstract: AI models often require large amounts of compute resources, both during training and when deployed for inference tasks. This project focuses on optimizing efficiency during the training phase of especially large models, such as Large Language Models (LLMs), with respect to the hardware features available on the compute infrastructure. Here, so-called performance investigation tools, which collect data during the training procedure, allow the development of strategies to train larger models efficiently on modern hardware. In a second phase, alternative compute architectures will be investigated and compared to the state of the art in high-performance computing.

 

Belongs to project: Unconventional Computing for Efficient LLM Inference

Supervisors: Prof. Dr. Wolfgang E. Nagel

Location: Dresden

% of Position: 100%

Abstract: AI models often require large amounts of compute resources, both during training and when deployed for inference tasks. This project focuses on the development of optimization strategies to increase the efficiency of large models in the inference stage. Here, limitations often arise when deploying larger models on a given architecture (memory, bandwidth, latency, etc.). The project aims to provide optimization strategies for the usage of larger models, such as LLMs, on available and alternative hardware architectures.

 

Belongs to project: AI-Accelerated Quantum Chemistry for Enhanced QSAR Modeling on the SpiNNaker2 platform

Supervisors: Prof. Dr. Jens Meiler, Prof. Dr.-Ing. Christian Mayr

Location: Leipzig

% of Position: 75%

Abstract: Quantum chemical calculations provide detailed insights into molecular interactions but are computationally expensive for large-scale drug discovery. Traditional methods simplify models, neglecting electron density distributions. This research integrates machine learning (ML) to predict quantum chemical properties efficiently for larger systems, using descriptors like electron densities and molecular orbitals derived from methods such as density functional theory. A researcher with expertise in theoretical chemistry will design and curate diverse datasets for robust ML training. By combining ML with Quantitative Structure-Activity Relationship frameworks, this scalable approach identifies quantum-informed pharmacophores, enhances drug discovery, and addresses generalization challenges with physics-informed refinements.

Belongs to project: AI-Accelerated Quantum Chemistry for Enhanced QSAR Modeling on the SpiNNaker2 platform

Supervisors: Prof. Dr. Jens Meiler, Prof. Dr.-Ing. Christian Mayr

Location: Leipzig

% of Position: 75%

Abstract: Ultra-large libraries make billions of molecules available for rapid prediction and validation of drug candidates, but structure-based drug discovery methods face performance limits as libraries grow toward trillions. This project addresses these challenges by leveraging the massively parallel SpiNNaker2 computing platform. The work involves implementing and benchmarking neural network-based QSAR models and generative diffusion methods on SpiNNaker2 to predict protein-ligand complexes and binding affinities. By integrating protein structure prediction methods like AlphaFold2 and molecular dynamics, the project aims to enable exhaustive screening of multi-billion compound libraries, advancing precision medicine through the identification of novel ligands for rare protein variants.

Belongs to project: AI-Accelerated Quantum Chemistry for Enhanced QSAR Modeling on the SpiNNaker2 platform

Supervisors: Prof. Dr.-Ing. Christian Mayr, Prof. Dr. Jens Meiler

Location: Dresden

% of Position: 100%

Abstract: In recent years, AI methods have revolutionized computer-aided drug discovery. Quantum Mechanical (QM) methods, which offer the most accurate biophysical description of drug properties, are too computationally intensive for ultra-large molecule library screening. This project will take an innovative approach to overcome this limitation by developing AI-accelerated drug discovery methods leveraging the neuromorphic, massively parallel SpiNNaker2 platform. To this end, AI-driven QM descriptors will be developed and integrated, using equivariant neural networks to predict electron density. The generation of QM training data will be accelerated on SpiNNaker2 through a Monte Carlo (MC) approach to Density Functional Theory (DFT). Electron density-based descriptors can then be incorporated in QSAR modeling and other downstream tasks for enhanced and faster drug discovery.

Belongs to project: Transfer Learning Architecture Search

Supervisors: Prof. Pascal Kerschke, Dr. Aaron Klein

Location: Dresden

% of Position: 100%

Abstract: Hyperparameter optimization (HPO) and neural architecture search (NAS) automate the process of selecting the optimal hyperparameters and architectural design choices for deep learning neural networks, enabling practitioners to maximize their model’s performance. Apart from performance, further metrics, such as latency or memory consumption, are to be optimized simultaneously, requiring a multi-objective view on the problem. This project aims to expand the current state-of-the-art in multi-objective HPO/NAS. A key focus will be on accelerating the optimization process by exploiting data collected on previous optimization tasks. This transfer-learning approach will model similarities between tasks to warm-start the multi-objective optimization process instead of starting it from scratch.

Time series analysis and reinforcement learning

Belongs to project: Artificial intelligence methods for modeling time series data

Supervisors: Prof. Dr. Ostap Okhrin, Dr. Josefine Umlauft

Location: Dresden

% of Position: 100%

Abstract: Background: In light of accelerated glacial retreat under changing climate conditions, a profound understanding of coupled processes within the cryosphere is essential for updating physical models and future predictions. Cryoseismology, the research field that focuses on the investigation of seismic waves from glaciers, offers great value in this regard but also faces a significant challenge: the signals of interest often get obscured by high levels of environmental noise from various sources. Specifically, basal sources such as stick-slip events frequently remain undetected due to their long travel paths to surface sensors and associated wave attenuation.
This issue prompts critical research questions that remain unanswered: Is subglacial stick-slip sliding a local phenomenon or does it impact the entire glacier, varying with surface melt and ice thickness? How do these events respond to changing meteorological conditions, particularly melt-induced surges? Can the distribution of stick-slip activity under different hydraulic conditions predict the stability and failure of steep ice tongues? Detecting stick-slip events is crucial for understanding glacier sliding dynamics, necessitating investigation across the entire glacier, from the ablation to the accumulation zones. To address these issues, new developments in statistical and machine learning, as discussed in the WP descriptions below, are required to deal with such high-frequency, noisy data.
Objective: Distributed Acoustic Sensing (DAS) offers a promising solution to the mentioned issues, using fiber-optic cables and an interrogation unit to measure strain rates. This technology provides seismic data with exceptional spatial and temporal resolution, for example, across an entire Alpine glacier. Innovative techniques are required to efficiently denoise large cryoseismological DAS datasets to effectively detect stick-slip events. Drawing from extensive experience in modeling time-series data in finance, economics, insurance, and transportation, our goal is to adapt similar methodologies for cryoseismology. The proposed work plan is structured as follows:

Belongs to project: Artificial intelligence methods for modeling time series data

Supervisors: Prof. Dr. Markus Scholz, Prof. Dr. Ostap Okhrin

Location: Leipzig

% of Position: 65%

Abstract: Time-series data in biology are often sparse and noisy, which presents challenges for applying conventional AI models. Addressing these issues requires tailored AI architectures and transfer learning techniques. Our goal is to develop improved methods for analyzing sparse time-series data by incorporating biological knowledge into model design and transfer learning approaches.
We are seeking a candidate with a strong background in mathematics, physics, informatics, or a related field. Familiarity with machine learning or AI applications is preferred, as is an interest in working on biological problems. We offer a stimulating interdisciplinary working environment with opportunities for collaboration across fields such as mathematics and earth sciences. We look forward to your application.

Belongs to project: Artificial intelligence methods for modeling time series data

Supervisors: Prof. Dr. Miguel Mahecha, Prof. Dr. Markus Scholz

Location: Leipzig

% of Position: 75%

Abstract: Extreme climate events like heatwaves, heavy precipitation, and droughts are intensifying. These events profoundly impact vegetation health. Photosynthesis, a key process, is affected, impacting forestry and agriculture. Novel satellite data can detect these impacts as spatiotemporal anomalies. Current methods cannot detect 3D anomalies in such data streams in a weakly supervised manner. They also fail to attribute events to environmental factors like past weather, soil properties, or human ecosystem management. This project develops weakly supervised machine learning algorithms to detect and analyze these events. It uses explainable AI for attribution and enables predictions of future climate impacts.

Belongs to project: Explainable Reinforcement Learning Methods for Engineering

Supervisors: Prof. Dr. Ostap Okhrin, Dr. Patrick Ebel

Location: Dresden

% of Position: 100%

Abstract: User-centered design is expensive and time-consuming, requiring extensive user studies throughout the product development process. Computational user models, also known as simulated users, are a potential way to save time and money and to make the evaluation of digital devices more dynamic. By simulating aspects of human behavior relevant to product design, these models make it possible to predict how users will interact with new products and what problems they may encounter at an early stage of prototype development. As of now, most of the computational models used to evaluate digital products such as smartphones or in-vehicle infotainment systems are purely data-driven and rely on black-box machine learning models.
Although XAI approaches such as SHAP provide human-understandable explanations for these black-box models, they solely explain the output of the machine learning model. Thus, there is no guarantee that the model itself represents human behavior and also no guarantee that the explanations capture the real causal relationship between human behavior and interactive technology. One way to address this challenge is to model human behavior within the framework of computational rationality. Theories of computational rationality assume that users choose their behavior to maximize their expected utility, given their bounds. Applied to human-computer interaction, users interact with technology to achieve an optimal outcome given their internal (e.g., cognitive or perceptual) and external (e.g., environmental or tool design) bounds. The behavioral policies (i.e., the simulated behavior) can be approximated through reinforcement learning, yielding verifiable predictions of user interaction behavior.
Although machine learning is used to learn behavioral policies and make predictions, the bounds formulated (e.g., perceptual or motor constraints) are grounded in cognitive theory and limit the space of computable interaction strategies of the agent so that the interactions represent realistic human behavior. While the bounds defined for the internal environment allow us to exclude impossible behaviors, the only way to generate explanations for agent behavior is to run “what-if” simulations. This allows us to evaluate how behavior changes when we change human capabilities (internal bounds) or system design (external bounds).
However, this requires retraining and re-running the model with new configurations. Even if this process could be repeated infinitely fast and infinitely often, we would still not know why the model performed a particular action at a particular time. Thus, generating viable explanations for each action is of great value for better understanding human behavior and decision making. In addition, current computational rationality (CR) approaches still rely heavily on reward shaping (i.e., manually specifying the reward function to generate expected behavior and match human data). However, manually specifying the reward function is a step in the CR modeling process that is often criticized because it requires the modeler to define the goals of the agents. Here, inverse reinforcement learning (RL) methods may be a promising way to learn more about human motivation by extracting the reward function from observed human behavior.

Belongs to project: Explainable Reinforcement Learning Methods for Engineering

Supervisors: Prof. Dr. Frank Cichos, Dr. Nico Scherf

Location: Leipzig

% of Position: 100%

Abstract: Join our PhD project applying actor-critic reinforcement learning to control hydrodynamic flow and temperature in microfluidic environments, enabling precise particle separation and the navigation of microrobotic systems. A central focus lies in explaining and visualizing neural activations and attention maps within the RL agents, revealing the physical principles behind optimized strategies. By building a real-time, interpretable RL pipeline that processes imaging data at high speed, we aim to optimize microfluidic flows, uncover how AI decisions align with physical phenomena, and push the boundaries of advanced control in drug discovery and beyond.

Visualization and causal inference

Belongs to project: AI-Augmented Data Stories: Generating Images and Text for Understanding and Explaining Large and Complex Scientific Data

Supervisors: Prof. Dr.-Ing. Raimund Dachselt, Prof. Dr. Gerik Scheuermann

Location: Dresden

% of Position: 100%

Abstract: This PhD project focuses on enhancing comprehension and engagement with complex scientific visualizations, particularly of climate and weather data, by developing automated explanations through data storytelling. In this collaborative project, a framework will be designed, created, and evaluated to produce tailored data stories for large scientific visualizations. Key tasks include investigating AI methods suitable for data storytelling and creating the framework for generating these narratives using explanations and visualizations from scientific datasets. Additionally, the project will explore the integration of these narratives with immersive technologies to provide interactive user experiences and assess the framework’s utility across diverse contexts and application areas.

Belongs to project: AI-Augmented Data Stories: Generating Images and Text for Understanding and Explaining Large and Complex Scientific Data

Supervisors: Prof. Dr. Gerik Scheuermann, Prof. Dr.-Ing. Raimund Dachselt

Location: Leipzig

% of Position: 100%

Abstract: This PhD project aims to develop AI-driven approaches for automatic text generation from complex scientific visualization results. These results include the spatio-temporal location of different phenomena, such as cyclones or atmospheric rivers, with specified properties of interest. The descriptive text which includes locations and various properties of phenomena should, in conjunction with the rendered visualizations, enable an intuitive understanding of complex scientific data. The developed system should be able to generate verbal descriptions tailored to a specific target audience. This project will focus on climate and weather data in the context of data storytelling.

Belongs to project: Fertilization, climate and biodiversity: visualisation, modelling, causal inference

Supervisors: Prof. Dr. Johannes Quaas, Prof. Dr. Gerik Scheuermann

Location: Leipzig

% of Position: 75%

Abstract: The proposed PhD project aims to develop AI methods oriented towards combined data from climate and biodiversity. Specifically, it will use data from biodiversity network observations aiming to identify the roles of land use practices on ecosystems and biodiversity, and combine it with climate and atmospheric data. The project will (a) perform a comprehensive visualisation and feature analysis, (b) model the ecosystem and biodiversity state parameters as a function of drivers including the atmosphere, and (c) develop a causal inference approach to identify where and to what extent climate is responsible for ecosystem and biodiversity variability and changes. The project combines expertise from ScaDS.AI Leipzig and Dresden, from Meteorology and the biodiversity research centre iDiv.

ML and ethics for protein design and chemical reactions

Belongs to project: Evaluation of artificial intelligence (AI) methods in protein design and their implications for ethical research in biomedicine

Supervisors: Jun.-Prof. Dr. Clara Schoeder

Location: Leipzig

% of Position: 75%

Abstract: This project will focus on cutting-edge AI tools in protein design, which are typically foundation models. We want to investigate what these models have abstracted and subsequently fine-tune them for specific protein design tasks, such as vaccine design. We are especially interested in combining foundation models with specialized models for holistic design approaches.

Belongs to project: Evaluation of artificial intelligence (AI) methods in protein design and their implications for ethical research in biomedicine

Supervisors: Prof. Dr. Birte Platow, Dr. Hermann Diebel-Fischer

Location: Dresden

% of Position: 75%

Abstract: This second project investigates the ethical and societal implications of AI in protein design. The PhD student will work closely with computer scientists and protein designers, and will subsequently work towards defining uncertainty measures for these models, define dual-use cases, and, together with the other student, investigate the epistemological process of scientists working with AI biodesign tools.

Belongs to project: Learning to Explain: Inference of Chemical Reaction Mechanisms

Supervisors: Prof. Dr. Peter Stadler, Jun.-Prof. Julia Westermayr

Location: Leipzig

% of Position: 75%

Abstract: Chemical reactions can be understood as a step-wise transition from reactant to product structures in which individual bonds are formed and broken in consecutive steps, each involving one or several concerted transitions. In chemical terms, these steps are described as “electron pushing”. The project aims to develop a framework to predict such electron-pushing mechanisms. A difficulty is the absence of large-scale training data. Instead, theoretical considerations as well as quantum-chemical simulations will be employed. Conversely, predicted electron-pushing mechanisms are to be used to guide atom-level spatiotemporal simulations of chemical reactions.

Belongs to project: Learning to Explain: Inference of Chemical Reaction Mechanisms

Supervisors: Jun.-Prof. Julia Westermayr, Prof. Dr. Peter Stadler

Location: Leipzig

% of Position: 75%

Abstract: This project focuses on using advanced machine learning (ML) techniques to understand and predict chemical reactions. Specifically, it combines a computer model that can propose potential reaction steps with a reinforcement learning (RL) system—a type of AI that learns through trial and error, much like a game. The system gets better over time by learning from its mistakes, improving its ability to predict how chemicals react with high accuracy. This method will help identify the most likely reaction pathways, even for complex and light-driven reactions. In addition, the work will emphasize interpretable and explainable ML solutions to bridge theoretical insights with practical applications in reaction mechanism discovery.

Belongs to project: Deep learning of protein-ligand interaction fingerprints based on functional atom matching for applications in drug discovery and protein design

Supervisors: Prof. Dr. Michael Schroeder, Dr. Georg Künze

Location: Dresden

% of Position: 75%

Abstract: Focus on algorithms and drug discovery. Based in Dresden. Molecular fingerprints represent a cornerstone of drug discovery and design. They are widely used for fast screening of ligand molecules that can bind to a target protein. Molecular fingerprints can also serve as a guide for protein engineering, aiming to enhance a protein’s binding affinity for specific molecules. Although drug discovery and protein design have distinct perspectives, they share the same objective of optimizing the underlying molecular interactions.
This project aims to develop a general molecular design framework based on deep learning of molecular interaction fingerprints to provide a new set of AI tools for molecular engineering. The representation of protein-ligand interactions will be based on pharmacophore features, which capture chemical complementarity and specificity. Our AI framework will integrate ligand-centric and protein-centric learning algorithms, making it suitable for a versatile set of applications including prediction of molecular properties as well as drug and protein design. By training and refining our model on comprehensive datasets of protein-ligand interactions, we will explore its potential to advance state-of-the-art drug and protein design methodologies. We anticipate that our conceptual framework will be useful for many tasks, e.g., virtual drug screening or sourcing of enzymes for green chemistry processes. This Leipzig-Dresden collaboration will also provide essential input to student training in AI, including the only mandatory bachelor AI course, Intelligente Systeme, in Dresden.

Belongs to project: Deep learning of protein-ligand interaction fingerprints based on functional atom matching for applications in drug discovery and protein design

Supervisors: Dr. Georg Künze, Prof. Dr. Michael Schroeder

Location: Leipzig

% of Position: 75%

Abstract: Focus on protein design and modelling. Based in Leipzig. Molecular fingerprints represent a cornerstone of drug discovery and design. They are widely used for fast screening of ligand molecules that can bind to a target protein. Molecular fingerprints can also serve as a guide for protein engineering, aiming to enhance a protein’s binding affinity for specific molecules. Although drug discovery and protein design have distinct perspectives, they share the same objective of optimizing the underlying molecular interactions.
This project aims to develop a general molecular design framework based on deep learning of molecular interaction fingerprints to provide a new set of AI tools for molecular engineering. The representation of protein-ligand interactions will be based on pharmacophore features, which capture chemical complementarity and specificity. Our AI framework will integrate ligand-centric and protein-centric learning algorithms, making it suitable for a versatile set of applications including prediction of molecular properties as well as drug and protein design. By training and refining our model on comprehensive datasets of protein-ligand interactions, we will explore its potential to advance state-of-the-art drug and protein design methodologies. We anticipate that our conceptual framework will be useful for many tasks, e.g., virtual drug screening or sourcing of enzymes for green chemistry processes. This Leipzig-Dresden collaboration will also provide essential input to student training in AI, including the only mandatory bachelor AI course, Intelligente Systeme, in Dresden.

funded by:
Funded by the Federal Ministry of Education and Research (Bundesministerium für Bildung und Forschung).
Funded by the Free State of Saxony (Freistaat Sachsen).