Status: open / Type of Thesis: Master thesis / Location: Leipzig
Various types and architectures of autoencoders have shown promising results for multi-modal data integration, embedding, and representation learning on large-scale molecular genetic data. To make them easily available to users, we established the framework AUTOENCODIX [1]. Hyperparameter optimization is often critical to achieving good performance with deep learning architectures such as autoencoders, but common methods and tools often fall short in this setting because of the computational demands of such large-scale applications. Commonly used approaches optimize the hyperparameters of each deep neural network in isolation, without exploiting knowledge from previous optimization runs. In contrast, human experts usually develop a good mental model of hyperparameters over time. In this thesis, the student should explore a meta-learning approach that warm-starts the optimization process by learning from previous tasks. To this end, training data for such meta-models will be simulated by training a large number of autoencoders across many data sets and hyperparameter configurations using our framework AUTOENCODIX. Finally, different meta-modelling strategies for autoencoders should be evaluated and benchmarked against approaches implemented in open-source libraries such as Optuna [2] or Syne Tune [3].
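As a rough illustration of the warm-starting idea, the following minimal sketch seeds a random search on a new task with the best configurations found on previous tasks. Everything in it is an assumption made for illustration: the toy loss function, the two hyperparameters (log_lr, latent_dim), and the helper names are hypothetical and do not use the AUTOENCODIX, Optuna, or Syne Tune APIs. A real meta-model would predict promising configurations from task features rather than simply reusing earlier winners.

```python
import random


def toy_loss(config, optimum):
    # Hypothetical stand-in for the validation loss of a trained autoencoder:
    # the loss grows with the distance from a task-specific optimum.
    lr_term = (config["log_lr"] - optimum["log_lr"]) ** 2
    dim_term = (config["latent_dim"] - optimum["latent_dim"]) ** 2 / 100.0
    return lr_term + dim_term


def sample_config(rng):
    # Assumed search space: log10 learning rate and latent dimension.
    return {
        "log_lr": rng.uniform(-5.0, -1.0),
        "latent_dim": rng.choice([8, 16, 32, 64, 128]),
    }


def search(loss_fn, n_trials, rng, warm_start=()):
    # Random search that evaluates warm-start configurations first
    # and fills the remaining trial budget with random samples.
    candidates = list(warm_start)[:n_trials]
    while len(candidates) < n_trials:
        candidates.append(sample_config(rng))
    best_config, best_loss = None, float("inf")
    for config in candidates:
        loss = loss_fn(config)
        if loss < best_loss:
            best_config, best_loss = config, loss
    return best_config, best_loss


rng = random.Random(0)

# Previous tasks (e.g. other data sets), each with its own assumed optimum.
previous_optima = [
    {"log_lr": -3.0, "latent_dim": 32},
    {"log_lr": -3.5, "latent_dim": 64},
    {"log_lr": -2.5, "latent_dim": 32},
]

# Collect the best configuration from a cold-started search on each previous task.
warm_configs = []
for optimum in previous_optima:
    config, _ = search(lambda c, o=optimum: toy_loss(c, o), n_trials=20, rng=rng)
    warm_configs.append(config)


def new_task_loss(config):
    # New task whose optimum is assumed to lie close to the previous ones.
    return toy_loss(config, {"log_lr": -3.2, "latent_dim": 32})


# Warm-started search on the new task with a smaller trial budget.
best_config, best_loss = search(
    new_task_loss, n_trials=10, rng=rng, warm_start=warm_configs
)
print(best_config, best_loss)
```

Because the warm-start configurations are evaluated before any random samples, the warm-started search can never do worse on the new task than the best of the transferred configurations. Whether that helps in practice depends on how similar the tasks are, which is the kind of question the thesis is meant to investigate.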
Student profile:
Ideally, the thesis candidate already has some experience in Python, PyTorch and machine learning. Experience with biomedical data is helpful but not necessary.
References
[1] https://github.com/jan-forest/autoencodix
[2] https://github.com/optuna/optuna
[3] https://github.com/syne-tune/syne-tune