M.Sc. Thesis: Benchmarking of federated learning tools for genetic data analysis

Type of thesis: Master's thesis / location: Leipzig / Status of thesis: Open theses

Federated Learning (FL) is a method for training machine learning algorithms across multiple decentralized edge devices or servers. Each node trains on its local data subset, and no data is exchanged between the nodes. After training, the weights of the local models are sent to a common trusted server, which uses them to optimize a central model. The central model is then sent back to the edge devices, where it can be used. In this way, machine learning can be performed without sharing sensitive raw data. This technique is particularly useful for private data such as genetic data. Human genomic data is the blueprint of each individual, encoded in DNA, and is therefore among the most private information about a person and their relatives.
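
The training loop described above can be illustrated with a minimal sketch in plain Python. It is not taken from any of the tools to be benchmarked. It assumes a toy linear model y = w*x + b, synthetic data, and a sample-size-weighted average of the local weights on the server (the scheme commonly called federated averaging). All function names (local_train, federated_average, run_federated, make_node) and hyperparameters are hypothetical and chosen only for illustration.

```python
import random


def local_train(weights, data, lr=0.5, epochs=5):
    """Train on one node's private data, starting from the central weights.

    Runs a few epochs of gradient descent on the mean squared error of
    the toy model y = w * x + b. Only the weights leave the node.
    """
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum((w * x + b - y) * x for x, y in data) / n
        grad_b = sum((w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)


def federated_average(local_weights, sizes):
    """Server step: average local weights, weighted by local sample count."""
    total = sum(sizes)
    w = sum(lw[0] * s for lw, s in zip(local_weights, sizes)) / total
    b = sum(lw[1] * s for lw, s in zip(local_weights, sizes)) / total
    return (w, b)


def run_federated(nodes, rounds=50):
    """Alternate local training and server-side averaging for several rounds."""
    central = (0.0, 0.0)
    for _ in range(rounds):
        # Each node receives the central model and trains on its own data.
        local = [local_train(central, data) for data in nodes]
        # The server sees only weights and sample counts, never raw data.
        central = federated_average(local, [len(d) for d in nodes])
    return central


def make_node(n, seed):
    """Create synthetic local data following y = 2x + 1 (assumed toy data)."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, 2.0 * x + 1.0) for x in xs]


# Three nodes with local datasets of different sizes.
nodes = [make_node(20, 1), make_node(30, 2), make_node(50, 3)]
w, b = run_federated(nodes)
print(f"central model: w={w:.4f}, b={b:.4f}")  # should be close to w=2, b=1
```

In this toy setting the central model recovers the parameters shared by all nodes, although no node ever transmits its (x, y) pairs. Real tools add further concerns that the sketch ignores, such as secure communication, heterogeneous data distributions, and protection against information leakage through the shared weights.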

In this interdisciplinary thesis, open-source tools for federated learning will be benchmarked for genetic and clinical data analysis. The thesis will study the trade-offs between computational resource requirements and privacy, as well as the accuracy of federated learning compared with non-federated learning.

Examples of federated learning tools: