Next Generation Computer-aided Drug Discovery for Small Molecules

Project duration: 05/2021 to 05/2025

Research Area: Life Science and Medicine

The field of computer-aided drug discovery is undergoing a transformation. Three developments drive it: the availability of protein structures, expansive chemical spaces, and advances in geometric deep learning. However, several challenges remain.

We developed a novel method to screen ultra-large chemical spaces containing billions of molecules: RosettaEvolutionaryLigand (REvoLd), an evolutionary algorithm implemented in Rosetta. During development, we identified two key issues. First, the docking algorithm we used is precise but computationally expensive; preliminary experiments showed that combining diffusion methods with Monte Carlo sampling improves on either technique alone. Second, existing affinity prediction methods lack precision. Precise affinity prediction is crucial for drug discovery but often overlooked by the machine learning community.
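The idea behind evolutionary screening of a combinatorial chemical space can be illustrated with a minimal sketch. This is not the REvoLd implementation: the two-fragment library, the `toy_score` function (a stand-in for a docking score), the "nearby index means similar fragment" mutation, and all parameter values are assumptions made purely for illustration. The point is that only a tiny fraction of the space is ever scored.

```python
import random

# Hypothetical combinatorial library: a molecule is a pair of fragment indices.
N_A, N_B = 40_000, 40_000  # 1.6 billion virtual molecules, never enumerated


def toy_score(mol):
    """Stand-in for a docking score (lower is better); not a real scoring function."""
    a, b = mol
    return abs(a - 12_345) / N_A + abs(b - 31_415) / N_B


def random_molecule(rng):
    return (rng.randrange(N_A), rng.randrange(N_B))


def mutate(mol, rng, step=500):
    """Replace one fragment by a 'similar' one (assumption: nearby index)."""
    a, b = mol
    if rng.random() < 0.5:
        a = min(N_A - 1, max(0, a + rng.randint(-step, step)))
    else:
        b = min(N_B - 1, max(0, b + rng.randint(-step, step)))
    return (a, b)


def crossover(p1, p2):
    """Combine the first fragment of one parent with the second of the other."""
    return (p1[0], p2[1])


def evolve(generations=30, pop_size=50, n_keep=10, seed=0):
    rng = random.Random(seed)
    scored = {}  # cache, so each molecule is scored at most once

    def score(mol):
        if mol not in scored:
            scored[mol] = toy_score(mol)
        return scored[mol]

    population = [random_molecule(rng) for _ in range(pop_size)]
    history = []  # best score seen at the start of each generation
    for _ in range(generations):
        population.sort(key=score)
        history.append(score(population[0]))
        parents = population[:n_keep]  # elitism: the best molecules survive
        children = []
        while len(children) < pop_size - n_keep:
            p1, p2 = rng.sample(parents, 2)
            children.append(mutate(crossover(p1, p2), rng))
        population = parents + children
    best = min(population, key=score)
    return best, score(best), len(scored), history


if __name__ == "__main__":
    best, best_score, n_evaluated, _ = evolve()
    print(f"best molecule {best} with score {best_score:.4f}")
    print(f"scored {n_evaluated} of {N_A * N_B} molecules")
```

With these settings at most 1,250 molecules are scored, whereas exhaustive docking would need 1.6 billion evaluations. In a real setting, each call to the scoring function would be an expensive docking run, which is why the docking cost noted above matters so much.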

[Figure: several molecular structures with overlapping colored shapes and numerical values indicating different properties.]

Aims

  • Screen chemical spaces containing billions of molecules.
  • Enhance docking methods in terms of precision and speed.
  • Develop robust affinity prediction to provide reliable criteria for selecting candidates for in-vitro testing.

Problems

Developing new small-molecule pharmaceuticals is challenging because the chemical space to search is vast. Computational methods help preselect candidates, reducing the number of experimental tests and iteration cycles needed to discover robust initial hits. However, established methods have significant shortcomings, and new machine learning methods are often built around artificial benchmarks because of limited domain knowledge.

Practical Example

We participated in the first round of the CACHE challenge, predicting potential drug candidates for a protein associated with Parkinson’s disease. Of the 145 molecules we submitted, five were promising hits (about 3.4%). This demonstrated the potential of our evolutionary algorithm while also highlighting the limitations of our docking protocol and scoring function.

Technology

We use PyTorch for machine learning and the e3nn library for tensor product computations in molecular systems. RDKit prepares small-molecule inputs, while Rosetta handles large biomolecules. We develop in C++ and Python and run on high-performance compute clusters with OpenMPI and on powerful GPUs with CUDA.

Outlook

We plan to finalize state-of-the-art docking protocols and develop local geometry-aware convolution kernels for predicting interactions between molecules. Future work will focus on integrating these kernels into deep neural networks for affinity prediction. Additionally, we aim to create an affinity predictor that does not require a structure of the protein-ligand complex, and to develop a foundation model for molecular feature prediction that can adapt to multi-modal settings and specific tasks.
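To give a sense of what "local" and "geometry-aware" can mean, here is a deliberately simplified sketch. It is not the planned kernel and does not use e3nn: it is a plain-Python, distance-weighted neighbor sum whose function names, Gaussian filter, and cutoff value are all assumptions chosen for illustration. Because the filter depends only on interatomic distances, the output is unchanged when the whole system is rotated, which is the simplest form of the symmetry that geometric deep learning exploits.

```python
import math


def rotate_z(points, angle):
    """Rotate 3D points about the z-axis by `angle` (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]


def radial_filter(d, cutoff, width=1.0):
    """Gaussian weight of the distance, zero beyond the cutoff (hard cutoff for brevity)."""
    if d >= cutoff:
        return 0.0
    return math.exp(-((d / width) ** 2))


def local_conv(points, features, cutoff=4.0):
    """For each atom, sum neighbor features weighted by a distance-only filter."""
    out = []
    for i, p_i in enumerate(points):
        total = 0.0
        for j, p_j in enumerate(points):
            if i == j:
                continue
            total += radial_filter(math.dist(p_i, p_j), cutoff) * features[j]
        out.append(total)
    return out


if __name__ == "__main__":
    # Hypothetical coordinates and scalar per-atom features
    points = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 1.2, 0.5), (10.0, 10.0, 10.0)]
    features = [1.0, 2.0, -0.5, 1000.0]
    original = local_conv(points, features)
    rotated = local_conv(rotate_z(points, 0.7), features)
    # The two results agree up to floating-point error; the distant fourth
    # atom lies outside every cutoff sphere and does not contribute.
    print(original)
    print(rotated)
```

A distance-only filter like this is invariant but discards directional information. Equivariant approaches, such as those built on tensor products, retain it, at the cost of more involved machinery.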

Publications

  • REvoLd: Ultra-Large Library Screening with an Evolutionary Algorithm in Rosetta [in preparation]
  • Utilizing Ultra-large library screening in Rosetta: A Case Report for Novel Binder of the WD-Repeat Domain of Leucine-Rich Repeat Kinase 2 [in preparation]

Funded by:
Funded by the Federal Ministry of Education and Research (Bundesministerium für Bildung und Forschung).
Funded by the Free State of Saxony (Freistaat Sachsen).