Compressing Large Language Models

On 06.03.2025 at 11:00 a.m., the 30th lecture of the Living Lab lecture series will take place. In this lecture, Aaron Klein (Leipzig University) will speak about Compressing Large Language Models.

Large Language Models (LLMs) mark a new era in Artificial Intelligence. However, their large size poses significant challenges for inference in real-world applications due to substantial GPU memory requirements and high inference latency.

In this talk, we discuss techniques to compress pre-trained LLMs, reducing their resource consumption during inference while maintaining their performance. More specifically, we approach the problem from a multi-objective Neural Architecture Search (NAS) perspective to jointly optimize performance and efficiency.

By considering the LLM as a super-network consisting of a large but finite number of sub-networks, we can identify a set of Pareto-optimal sub-networks that balance parameter count and validation performance. We empirically demonstrate that using NAS techniques for fine-tuning enhances the prunability of pre-trained LLMs and explore how this impacts real-world applications.
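As a rough illustration of what "Pareto-optimal sub-networks" means here, the following minimal sketch filters a handful of candidates by parameter count and validation loss. The candidate names and all numbers are invented for illustration and are not results from the talk; the function `pareto_front` is a hypothetical helper, not the speaker's method.

```python
from typing import List, Tuple

# Hypothetical candidate: (name, parameter count in millions, validation loss).
# All values below are made up for illustration only.
Candidate = Tuple[str, float, float]


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Return the candidates that no other candidate dominates.

    A candidate is dominated if another one is at least as good in both
    objectives (fewer or equal parameters, lower or equal loss) and
    strictly better in at least one of them.
    """
    front = []
    for name, params, loss in candidates:
        dominated = any(
            (p <= params and l <= loss) and (p < params or l < loss)
            for _, p, l in candidates
        )
        if not dominated:
            front.append((name, params, loss))
    # Sort by parameter count so the trade-off curve is easy to read.
    return sorted(front, key=lambda c: c[1])


candidates = [
    ("full", 1000.0, 2.10),
    ("sub_a", 700.0, 2.15),
    ("sub_b", 700.0, 2.40),  # same size as sub_a but worse loss
    ("sub_c", 400.0, 2.35),
    ("sub_d", 500.0, 2.50),  # larger and worse than sub_c
]

front = pareto_front(candidates)
for name, params, loss in front:
    print(f"{name}: {params:.0f}M parameters, validation loss {loss:.2f}")
```

In this toy setting, each remaining candidate represents a different trade-off: none of them can be made smaller without increasing the loss, or more accurate without increasing the size.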

Living Lab Lecture Series

The Living Lab Lecture Series gives you in-depth insight into the many research topics of ScaDS.AI Dresden/Leipzig. From Natural Language Processing to Ethics and Moral Code in AI, a great variety of topics is discussed. You can join our lectures on the first Thursday of every month or watch them on YouTube afterwards. If you have ideas for future topics, please let our Living Lab team know. We suggest that you regularly check our event calendar so that you never miss upcoming lectures or other interesting events organized by or in cooperation with our center.

FAQ

You can reach the permanent room for all lectures here: https://tud.link/i8zf

The room will be accessible 5 minutes before the start of the lecture.

Participation is free for everyone.

No prior knowledge is needed. One of our goals in the Living Lab lecture series is to familiarize everyone with these topics.

You just need an up-to-date browser such as Firefox, Google Chrome or Chromium. We also recommend using headphones for better audio quality.

You do not need to use a camera or microphone unless you would like to. In general, neither is required to participate in the lecture.

Besides joining the discussion with your camera and microphone, you can also submit questions and comments in writing in the Chat section.

Funded by:
Federal Ministry of Education and Research (Bundesministerium für Bildung und Forschung)
Free State of Saxony (Freistaat Sachsen)