Large language models (LLMs) represent a milestone in AI and form the research basis of the junior research group “Automated Large Language Models”. The group develops data-driven methods that automate manual workflows in the training and use of LLMs and thereby reduce their computational cost.
The junior research group therefore focuses on the following key topics:
Resource-efficient inference: While LLMs deliver impressive capabilities, their immense size introduces practical challenges, including high inference costs, significant memory requirements, and high latencies, particularly in real-time or resource-constrained environments such as mobile devices and embedded systems. Moreover, deploying LLMs at scale incurs substantial operational expenses, hindering their widespread adoption.
To mitigate these issues, the group develops efficient model compression techniques to reduce parameter counts while preserving high downstream task performance in specific domains.
Find out more about model compression in the accompanying open-source framework whittle.
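To make the idea of model compression concrete, below is a minimal sketch of one common technique, global magnitude pruning, written in plain PyTorch. It is illustrative only and does not use whittle's API; the toy model and the 50% sparsity level are assumptions.

```python
import torch
import torch.nn.utils.prune as prune

# Illustrative only: global unstructured magnitude pruning in plain
# PyTorch, not the whittle API. Model and sparsity level are assumptions.
model = torch.nn.Sequential(
    torch.nn.Linear(768, 3072),
    torch.nn.GELU(),
    torch.nn.Linear(3072, 768),
)

# Collect the weight tensors of all linear layers.
parameters_to_prune = [
    (module, "weight")
    for module in model.modules()
    if isinstance(module, torch.nn.Linear)
]

# Zero out the 50% of weights with the smallest magnitude across all layers.
prune.global_unstructured(
    parameters_to_prune,
    pruning_method=prune.L1Unstructured,
    amount=0.5,
)

# Make the pruning permanent (removes the masks, keeps the zeroed weights).
for module, name in parameters_to_prune:
    prune.remove(module, name)

zeros = sum(int((m.weight == 0).sum()) for m, _ in parameters_to_prune)
total = sum(m.weight.numel() for m, _ in parameters_to_prune)
print(f"sparsity: {zeros / total:.1%}")
```

Structured variants of this idea remove entire neurons, attention heads, or layers instead of individual weights, which is what turns sparsity into actual latency and memory savings on standard hardware.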
Accelerating Pre-Training: Pre-training LLMs demands an extraordinary amount of computational resources, making it challenging for research labs without industry-scale compute infrastructure to train their own models. The group’s objective is to design methods that stabilize and warm-start the pre-training process, enabling more efficient training and reducing the associated compute costs.
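As an illustration of what warm-starting pre-training can look like, the sketch below initializes a deeper transformer from a shallower trained one by stacking its blocks, a well-known model-growth strategy from the literature. This is not necessarily the group's method; all layer sizes are assumptions.

```python
import torch.nn as nn

# Illustrative sketch of depth stacking: initialize a deeper transformer
# from a (pre)trained shallower one by duplicating its blocks, so the
# large model starts close to the small one instead of from scratch.

def make_blocks(n_layers: int) -> nn.ModuleList:
    return nn.ModuleList(
        nn.TransformerEncoderLayer(d_model=256, nhead=4, batch_first=True)
        for _ in range(n_layers)
    )

small = make_blocks(6)   # stands in for a trained small model
large = make_blocks(12)  # target model, twice as deep

# Copy block i of the small model into blocks i and i + depth of the
# large model (assumed layer sizes; shapes must match for this to work).
for i, block in enumerate(small):
    large[i].load_state_dict(block.state_dict())
    large[i + len(small)].load_state_dict(block.state_dict())
```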
Hyperparameter Optimization: The effectiveness of pre-training or fine-tuning heavily depends on correctly setting the hyperparameters that govern the optimization process. However, current hyperparameter optimization (HPO) techniques often require multiple iterations of training and validation on a full-scale model, making them impractical for large LLMs. The group develops new HPO techniques that work without a single training or validation run of the full model, thereby drastically reducing the computational overhead of HPO.
Try out our open-source library syne-tune for large-scale HPO.
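The snippet below follows the usage pattern from Syne Tune's documentation: a multi-fidelity scheduler (ASHA) stops unpromising trials early instead of training every configuration to completion. The training script, search space, and metric names are placeholders, and minor details of the ASHA signature may differ between library versions. Note that multi-fidelity HPO still trains candidates at reduced fidelity; the group's research aims to remove even these runs of the full model.

```python
from syne_tune import Tuner, StoppingCriterion
from syne_tune.backend import LocalBackend
from syne_tune.config_space import loguniform, randint
from syne_tune.optimizer.baselines import ASHA

# Placeholder search space for the training script's hyperparameters.
config_space = {
    "learning_rate": loguniform(1e-5, 1e-2),
    "batch_size": randint(8, 128),
    "epochs": 20,
}

tuner = Tuner(
    # train.py is a placeholder: a script that reports a validation
    # metric once per epoch via syne_tune.Reporter.
    trial_backend=LocalBackend(entry_point="train.py"),
    # ASHA promotes promising trials and stops weak ones early.
    scheduler=ASHA(
        config_space,
        metric="validation_loss",
        resource_attr="epoch",
        max_resource_attr="epochs",
        mode="min",
    ),
    stop_criterion=StoppingCriterion(max_wallclock_time=3600),
    n_workers=4,  # number of trials evaluated in parallel
)
tuner.run()
```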
The junior research group leader, Aaron Klein, started at ScaDS.AI Leipzig on July 1st, 2024. He completed his Master's degree in Computer Science at the University of Freiburg, where he also began his PhD in Machine Learning.
Previously, Klein worked as an Applied Scientist at Amazon Web Services. His research focuses on Automated Machine Learning and Large Language Models.
More information about him, his publications, and his work can be found on his personal website.
The group does not have any publications yet.