LLM Inference Acceleration Techniques: A Literature Review
Type of thesis: Bachelor's thesis / Location: Dresden / Status of thesis: Open
Transformer-based large language models (LLMs) are a rapidly evolving field. As the number of pretrained models grows, so does the need for efficient inference methods. However, inference still suffers from underutilized hardware, and token generation on a small number of GPUs remains too slow for processing large amounts of data or for real-time applications. To overcome these issues, several inference acceleration methods and frameworks have emerged in recent years. Yet comparing their performance remains an open problem. The goal of this research project is therefore to review and evaluate the literature on inference acceleration and to produce a survey that the LLM community still lacks.
The topic can be worked on as a Bachelor’s thesis or research project.
TU Dresden
GPT-X, Natural Language Processing
ScaDS.AI Dresden/Leipzig (Center for Scalable Data Analytics and Artificial Intelligence) is a center for Data Science, Artificial Intelligence and Big Data with locations in Dresden and Leipzig.