At the 10th International Summer School on AI and Big Data, Dr. Stanislav Mazurenko will talk about Protein Representations for Efficient Machine Learning in Biology.
Protein engineering is undergoing a transformational shift due to the emergence of machine learning-based solutions to long-standing problems, such as predicting protein structures and function from sequence, evaluating the effects of mutations on various protein properties, and obtaining biological insights from molecular dynamics simulations. Despite recent progress in the domain, the question of an optimal representation of the available biological data remains largely open, and various approaches are being exploited and developed. This talk will explore the many ways of translating protein data into machine learning-friendly representations. We will cover a wide range of representations, from intuitive physicochemical features to more abstract embeddings, graphs, and Koopman operators, and discuss their advantages and disadvantages. Importantly, we will analyze them from the application point of view, including several recent examples motivated by problems in biochemistry, biotechnology, and medicine.
Stanislav Mazurenko earned his Ph.D. in applied mathematics and cybernetics from Lomonosov Moscow State University in 2013. In 2014, he joined the Protein Engineering group at Loschmidt Laboratories as a postdoc to work on protein thermal denaturation. Protein thermostability is typically assessed by calorimetric or spectroscopic techniques, which, while extremely powerful, produce data that must be deconvoluted before proper interpretation. Dr. Mazurenko developed CalFitter, a web-based tool for global data analysis and quantification of energy changes during protein unfolding, which has been accessed by over 9,000 users to date. In 2018, he completed a one-year postdoc at the University of Liverpool, UK, working on advanced nonlinear optimization methods. In particular, he proposed several primal-dual algorithms and proved their convergence for highly challenging nonconvex problems.
After being awarded the national Marie Skłodowska-Curie Actions @MUNI grant, Dr. Mazurenko returned to Brno to establish his own research team focused on machine learning, automation, and data analysis in protein engineering. The team’s goal is to delve into data, such as experimental measurements, protein sequences, or computer simulations, to gain insights into the underlying biophysical mechanisms and create reliable and interpretable tools for designing improved protein variants. These improved variants can then be used in numerous applications, from healthcare to the food industry and biodegradation. Among other things, the team’s research has resulted in two highly cited Perspectives in ACS Catalysis, the PredictONCO web tool for precision oncology, the FireProtDB database of curated protein stability data, and the first-of-its-kind SoluProtMutDB database of curated protein solubility data.
In 2020, Dr. Mazurenko received the GAMU MUNI Scientist Award for significant research results at Masaryk University. In 2023, he was invited to serve as guest editor of the special issue on AI for Synthetic Biology in the journal ACS Synthetic Biology. He now teaches several courses in different study programs at the Faculty of Science of Masaryk University, including AI in Biology and Bioengineering, Molecular Biotechnology, and Synthetic Biology. He is also a tutor at the annual workshop Hands-on Computational Enzyme Design at Loschmidt Laboratories. His current team includes one postdoc, five Ph.D. students, and several undergraduate students.
For more information, visit stasmazurenko.com.